Building Real World Cloud Apps with Windows Azure - Part 2

Abstract
This two-part talk will explore how to build real-world cloud applications using Windows Azure:
- Data storage options
- Data storage partitioning approaches
- Using unstructured blob storage
- Designing to survive failures
- Monitoring and diagnostics
- Transient fault handling
- Distributed caching
- Using the queue-centric work pattern

We'll discuss each of the above cloud patterns in the talk, and then demonstrate how to really use them by walking through real code that shows how to leverage them within a Windows Azure application.
Transcript: English (auto-generated)
It's always good to be the session between everyone and lunch. Okay, we'll go ahead and get started then. I think people are still trickling in. For people that were here in the previous talk, this is a part two of a two-part talk. If you weren't there in the previous talk, don't worry.
You missed some stuff. We're going to go ahead and pick it up where we left off, but there's not a ton of context that you're missing. Basically, with this talk we're walking through 13 cloud patterns that you can apply within your projects in order to build apps in the cloud that are robust and that can be delivered quickly and reliably. In particular, what we're going to talk a lot about in part two is how they can scale very well and be designed to survive failures: basically, how do you build robust applications that are resilient to the types of things that happen in any environment, but particularly in cloud environments, where things like connections dropping tend to happen more often.
And I'm going to talk about a bunch of different patterns that you can use both on the scaling and the reliability side that you can use in order to build even more robust solutions inside your app. And then before I get to that, though, I'm going to finish up the one thing I didn't actually get a chance to cover in part one,
which is to talk a little bit about data storage options and touch on that. And then we'll continue and pick up from where we are in part two. So in terms of data storage, I walked through identity in the previous section, and a couple of questions people asked me during the break were, Hey, can I enable single sign-on not just with my enterprise but with other enterprises as well
so I could build like a multi-tenant solution where I can enable multiple customers to do sign-in? The answer is yes, we do support that. We do support, again, SAML, OAuth, and WS-Fed endpoints to program against that directory. And someone also asked me, Hey, does Microsoft use this directory? You know, storing stuff in the cloud,
how do I convince people that anyone can do it? Are you guys really doing it yourselves? The answer is yes. And so I'm trying to find something that doesn't have too much confidential information that I go ahead and display across the screen. But you'll notice this sign-in looks very similar to the one that I did with our little FixIt app.
This is the Office 365 sign-in. In this case here, Microsoft enables what's called ADFS. So as part of signing in, it'll go ahead and actually redirect me to a sign-in server at Microsoft, but it's again going through the directory in the cloud. So in other words, the Microsoft domain controller is the only place that stores the password, rather than a hash of it being stored in the cloud. And now this is an internal app, with a deck I shouldn't open, that is hosted in the cloud, and I'm using integrated single sign-on to do it. So we actually put all of our secret documents and source control and performance management and sales reports up in the cloud now, and they're using this exact same single sign-on solution that I showed in part one to actually secure them. So yeah, we do trust it ourselves. If your company is an Office 365 customer, then when you signed up for Office 365 a cloud directory was actually set up in Azure as part of the Office 365 setup.
And so if your company is moving to Office 365, the good news is your directory is already going to be integrated in Azure, and so integrating within your apps, like I showed in part one, you can do as well. So anyway, that's a little bit on the directory side that people have asked questions. So let's pick it up and talk about data storage. So we have single sign-on. How do we actually store our data now in the cloud,
and what are the different options that we can use in order to do it? Well, with Azure, and in the cloud in general, there's usually a wide range of options. I talk to a lot of people for whom data storage means a SQL database. We have SQL databases, so the good news is you can continue doing that inside Azure. But one of the things that you can also do inside the cloud is leverage other types of data solutions as well,
things like NoSQL or unstructured blob storage or even solutions like Hadoop or MapReduce that layer on top of it and provide a different way to think about querying and interacting with data. Generally, I'd say no one solution meets everything.
There's definite benefits to the relational model. That's why it's been around so long. But there's also downsides, and there's definite benefits to NoSQL. That's one of the reasons why you hear it used so much, but there's also downsides. So often what I see with a lot of solutions is a compositional approach, where you use both relational and NoSQL
and unstructured in a single solution. And one nice thing about the cloud environment is it makes it really easy for you to actually have multiple data solutions and integrate them in a single app. And because you're not often having to manage all of the operational aspects of them, it's not as costly for you to do it.
In terms of specific data solutions on Windows Azure, we have three built-in solutions today that provide a relational service, or a relational database we call SQL databases, a table service, which is a NoSQL key value pattern-based solution that you can use to store hundreds of terabytes
in kind of a wide, sparse column-based approach, and a blob storage system, which you can logically think of as unstructured files in the cloud. And what's nice about these three options is they're provided as part of what we call a platform as a service model, which means that you don't have to create virtual machines in order to use them. Instead, you can just go to the portal
or use our PowerShell tools like I showed in part one and just say, I'd like me a 100 terabyte database, a 100 terabyte table store, and in 15 or 20 seconds, we'll create it for you. Likewise, you can create a storage account and you can start uploading gigabytes of blobs and files, or you can go ahead and create a database. And all that's done for you, we patch them,
we handle high availability for you. You don't have to do any of that yourself, and you don't have to buy any licenses to use everything on the left-hand side. So you don't actually have to buy a SQL license up front if you use our SQL database service. You just pay for what you use and the hours that you use it.
Alternatively, we also support with Windows Azure now what we call virtual machines, or infrastructure as a service, and you can spin up a Windows or Linux virtual machine and put anything you want in it. So if you want to use SQL Server, the same one you have on-premises, you can spin up virtual machines and deploy it. If you want to do MySQL or Postgres or RavenDB or Mongo or Couch or Redis or Neo4j or Riak, et cetera,
we have all those solutions. You can basically run anything that you can run in a virtual machine on Azure. So you have the full wide variety of basically being able to choose any kind of data solution and take advantage of it. There is no one right approach.
So anyone that tells you this technology is the answer, the first thing to ask is what is the question rather than what's the answer because different solutions, again, are optimized for different things. And even when you hear someone say we're embracing NoSQL, often when you actually drill into what they're doing,
they're not just using one NoSQL solution. They're actually sometimes using Mongo and Couch and Riak or Redis for different things. And so even someone like Facebook that uses NoSQL in a very deep way actually has multiple NoSQL implementations that they use for different parts of the service. These are a couple of questions that are worth asking as you start thinking about data storage.
What's the semantic? Do you want relational or do you want something more unstructured? What type of queries are you gonna run against this? Things like a table store with key-value pairs are exceptionally good at storing single rows that you can look up by key and read properties from. They're usually not very good at saying
find me all rows that have this characteristic. They're instead very, very good at give me an individual row very fast and scale that out. So they're very good at, say, a per user personalization store. They wouldn't be very good as a product catalog store where you often want to be able to search and do complex queries across them.
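To make that concrete, here's a minimal sketch of a key-based point lookup against the Azure Table service using the .NET storage client; the entity shape, table name, and per-user-settings scenario are illustrative assumptions, not code from the talk.

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical per-user personalization entity. PartitionKey/RowKey
// point reads like this are exactly what a table store is good at.
public class UserSettingsEntity : TableEntity
{
    public UserSettingsEntity() { }
    public UserSettingsEntity(string userId)
    {
        PartitionKey = userId;   // one partition per user
        RowKey = "settings";     // a single well-known row per user
    }
    public string Theme { get; set; }
}

public class SettingsStore
{
    private readonly CloudTable table;

    public SettingsStore(string storageConnectionString)
    {
        var account = CloudStorageAccount.Parse(storageConnectionString);
        table = account.CreateCloudTableClient().GetTableReference("usersettings");
        table.CreateIfNotExists();
    }

    // Fetch one row, fast, by its keys. "Find all rows where Theme = X"
    // is the kind of query this store is *not* optimized for.
    public UserSettingsEntity Get(string userId)
    {
        var retrieve = TableOperation.Retrieve<UserSettingsEntity>(userId, "settings");
        return (UserSettingsEntity)table.Execute(retrieve).Result;
    }
}
```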
Other things just in terms of how well can you scale it? Relational models and relational databases tend to inherently have scaling points that you hit that make it very difficult to have many terabyte relational databases that you can update in a cloud environment. But they're very good at other things.
And so just understanding, again, where you want to use them, these sets of questions hopefully provide some things to think about. And what I generally recommend is know the answer to each of these different categories before you pick the solution that you want to use. In this application, we're going to use multiple. We're going to use a relational database with SQL, and we're also going to use blob storage.
I'll talk more about why we're doing both in a second. But what I thought I'd do is just spend a few minutes here just showing how we're using a SQL database up front. So again, in the PowerShell script earlier, I went ahead and as part of the PowerShell script, created a SQL database server and instantiated a SQL database as part of it.
It's somewhere in here. I think it's this one right here that we created. Just to show off how you can easily create a SQL database, I'm going to use the portal. It's automated in production, but it's good for spiking. What would you like to call this database? Silent DB, how about? We can choose where we want to run it.
So we'll run it there. How long does it usually take you when you ask your system administrator to stand up a database for your app? A week. A week. Can anyone beat a week? A day. Okay, there we go. He's got a day. How many people usually take more than a week? How many people have to ask their boss before they're allowed to buy a SQL server license?
So what you can do inside Azure, which is kind of cool, and the developers love this, DBAs cry, is you go and you say, new database, and you hit OK, and you count. One, two, done. So I beat your day. I can do it in two seconds.
And so now I have a silent database running inside Azure, ready for me to use. And this is an example of our platform as a service model where you don't have to actually manage this. You don't have to do backups. We do it. It's running high availability, so we have three logical or physical servers that the data is replicated across. So if a machine blows up, we'll automatically fail over.
And the cool thing is, I can go ahead and very easily grab my connection strings, plug them into ADO.NET, and start using it. And what's nice is the pricing model: because I haven't stored any data in it, the pricing starts at, I think, five US dollars per month,
which I think would probably be about 25 kroner per month. And then you basically pay based on the database usage or the amount of data you stored in it. And so you don't have to buy a license. You're not actually having to go buy a SQL license. I can create any number of these that I want, and I can start putting data in it. Right now, this database is capped at one gig.
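As a side note on the "grab the connection string, plug it into ADO.NET" step above, here's a minimal sketch, assuming a placeholder connection string copied from the portal and a hypothetical table name; the point is that it's the same ADO.NET code you'd write against an on-premises SQL Server.

```csharp
using System.Data.SqlClient;

// Placeholder -- in practice, paste the ADO.NET connection string
// the portal shows for your SQL database.
var connStr = "Server=tcp:myserver.database.windows.net,1433;" +
              "Database=MyDb;User ID=user@myserver;Password=...;" +
              "Encrypt=True;Connection Timeout=30;";

using (var conn = new SqlConnection(connStr))
{
    conn.Open();
    // Hypothetical table; any regular T-SQL works here.
    using (var cmd = new SqlCommand("SELECT COUNT(*) FROM FixItTasks", conn))
    {
        int count = (int)cmd.ExecuteScalar();
    }
}
```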
If I wanted to, say, store 150 gigs in it, go ahead, do that, hit save. And again, about three seconds, I now have a 150 gig database that I can now deploy things into. And so that's kind of the power of the cloud in terms of being able to stand up this infrastructure and really be able to start using it. And for our simple Fixit app,
I'm basically using a database just like this. I actually have two databases, one for my membership and one for my data. And basically, that's all I had to do in order to provision it. And again, I'm doing that through the PowerShell scripts, but you hopefully saw in the portal how easy it is to also UI drive it. In terms of accessing this,
then what I'm using is EF code first, the new EF6 that has the async support, among other things. I basically have a Fixit context that I created, which just derives from the DB context. This is the connection string I'm using, and I basically then just have a Fixit tasks table that I'm storing my data in. And each Fixit task right now is sort of a POCO model,
so just plain old CLR object, where I have a couple different properties here that I'm using. And then what I basically just built is a repository interface on top of this for some finder methods and then some create, update, and delete methods. Notice they're all async. So I can basically do all my data access in a completely async way.
And then this is basically the code for me to go find and work with this. You can sort of see I can use LINQ queries, and I'm basically just querying that database and using it within the application. You'll notice there's some stopwatch timer and logging stuff. We're going to talk about that in a few minutes. That's basically the next step in terms of how I'm using the SQL database
and programming against it. And the beauty is the core programming model is consistent across SQL Server on-premise and SQL Server in the cloud using the SQL database option, and so I can reuse most of the same set of skills that I have. Now, there are some differences, and this slide actually talks about
the differences between using our SQL database as a service, which is the thing I just showed and which I'm using in the app, and the other option that we also support, which is SQL Server in a virtual machine, shown on the right-hand side. So you could do either one of those inside Azure. I'm using the one on the left for this app. What I try to do in this diagram, though,
is talk about the pros and cons of each because there are some differences between the two that you want to be aware of and then think about as you kind of choose your database solution. The benefit of the solution I'm showing you and using here is it's running as a service, so I as a developer don't have to manage it. I don't have to patch it. I don't have to buy a license, and I pay only for what I use. And so the overall total cost of ownership
of the solution tends to be much smaller, and it's in general pretty good at having a lot of small databases. I define small as something less than 100 gigabytes, and you can technically go up to 150 gigabytes, but in general I'd say the smaller the database size, the better the use case works. As you start getting close to the 150 gigabyte limit, you will start to see things like indexing become a constraint. But if you think your solution's only going to need a couple gigabytes of relational storage,
it should be able to handle that size just fine. It doesn't have all the features that the on-premise SQL Server has, so it doesn't have CLR Sprocs. It doesn't have the analysis services features and some of the other capabilities. And the recommended database size, again, the max database size is 150 gigs.
From a performance perspective, our guidance is we recommend having individual tables no more than 10 gigs, and that will make sure that you don't spend a lot of time on the indexing side running the performance issues. Alternatively, you can just spin up virtual machines and run SQL in them. This allows you to use the same SQL Server
that you run on-premises in the cloud, so it has all the features that you use on-premise. We do support the ability to use Always On, which is a feature that's in SQL 2012 and lets you run active-active SQL databases. High availability is built into the database as a service feature, but with Always On, you can do that as well with a full SQL Server.
That does require at least two virtual machines, but that is an option that you can do. And just recently, this month, we're shipping support that integrates Always On with our load balancers, so it composes very well with our networking stack. The option on the right also means
you can reuse your existing SQL licenses, so if you've already bought a license, you can just deploy a virtual machine and use it. We also have the ability for you to effectively rent SQL by the hour inside Azure: if you go into the portal, say New Virtual Machine, and use one of the SQL images, we'll actually charge you both the VM cost and then a prorated hourly rate for SQL on top of that. And so this is really good if you have a project that's only gonna run for a couple months; it's cheaper to rent it by the hour. If you think this project is gonna last for years, it's cheaper just to buy the license
the way you normally do and then stick it inside the virtual machine, but that's just something to know that's an option. And the downside, though, is you end up having to patch the virtual machine and manage the SQL database yourself. If you already have DBAs that do it, then they already know how to do that. But anyway, those are some of the pros and cons,
and there's sort of a white paper I'll link to from the slides that actually goes into more detail on it. But anyway, for this option here, we're gonna use the SQL database platform as a service feature, and as you saw, I'm using Entity Framework in order to program against it. So one of the things about cloud apps that's worth thinking about up front, ideally, is not only what data options you wanna use from a querying and capability standpoint, but also ultimately what the velocity of data you're gonna generate and put into that solution looks like. And so one of the things I recommend is always asking yourself up front the three V's questions. What is the volume of data you're gonna store?
Is it just a couple gigabytes, or could it be a couple hundred gigabytes? Or could it be terabytes, or could it be petabytes? It's worth knowing up front what that number might be as you're building your application. It's a lot easier to design for it up front than it is to react to it later.
Don't know what that means. Oh, okay, okay. Hopefully there's not an airstrike going on. This would be the time to do it, and the next section's on surviving failure, so we'll make it resilient to airstrikes as well.
What's the velocity of the data that's gonna come in? This is also really important to understand. If it's just an internal app with just employees facing it, you often aren't gonna be generating data at a huge rate, but if it's customer-facing and they start uploading lots of images or lots of fix-its into our fix-it app,
if it starts growing exponentially in terms of size, you really need to make sure that you think about your data strategy up front to handle it. And then what's the variety? Are these images, are they videos, are they relational data? Again, understanding exactly what type of data you're gonna store in there will help. And again, the key thing to really understand for all these three questions
is know those answers before you start your app, because if you think you're gonna have a lot of volume or a lot of variety or a lot of velocity on your data, you really wanna design up front with a partitioning scheme that's gonna let you scale out as your app grows to make sure that you don't have a single set of bottlenecks. And there's a couple different ways you can do it.
And so there's two patterns and sort of a hybrid pattern I'll walk through. One's called vertical partitioning, one's called horizontal partitioning. And in the vertical partitioning model here, if you think about, let's say, a simple set of, say, emails or fix-it issues, coming from a SQL world, I might think of this as a table that has a bunch of columns in it.
And I could have text columns and I might upload my photos as part of this as well. Anyone ever upload and save files or photos inside a database? This works, again, for small amounts of data. And for on-premises, this is what a lot of people do because they don't wanna save it to the file system because then they have to manage backing it up
and they might not have a SAN or some kind of file system that's able to support the amount of data they're having. It's just easier to, say, let the DBA do it. But if you're in the cloud and you have an app which might store hundreds of gigabytes of data, this is gonna become a real scaling bottleneck if you store all that content in there. And so one approach you can do is this thing we call vertical partitioning,
where instead of storing it all in one SQL database, you can partition up the data to be across multiple databases based on columns within it. And so I could say, hey, I'm gonna store all these fields on the left-hand side in one database. I can store the fields on the right-hand side, potentially, in another database. Or since these are binary files,
I might use a blob system. And that's a place where, as you're partitioning up, you can actually choose what's the most appropriate data source to use for each of the different vertical partitions. And the benefit here now is, for my fixit app, I can store 150 gigabytes of fixit information. If I'm just storing a few text fields for each item,
that translates into, I don't know how many, but probably billions of different fixits. And if I use the blob system for all the images, which is where the real volume of the data is really gonna be persisted, I don't have to worry about my SQL database becoming the bottleneck, because I've taken all the big files and I'm sticking them somewhere else. That's an example I call vertical partitioning.
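To put a rough number on "probably billions": assuming, hypothetically, about 150 bytes of text per fix-it row once the photos move to blob storage,

$$\frac{150\ \text{GB}}{150\ \text{bytes/row}} = \frac{150 \times 10^{9}}{150} = 10^{9}\ \text{rows},$$

so on the order of a billion fix-its in a single 150 GB database, with the image bytes living elsewhere.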
Horizontal partitioning, which you might often have heard of as sharding, is a slightly different take. Instead of partitioning vertically, we're gonna do it horizontally so that individual rows potentially get persisted onto separate data stores. And so you can use an approach, in this case here, this is a very naive, simple approach, which is the first letter of the last name, stick it on different database servers.
And so when someone comes to your app and says, my name is Scott Guthrie, you could say, oh, your record is on database G, I'm gonna go ahead and query it that way. In general, you wanna be very careful with your sharding strategy. It turns out a lot of people have first and last names that start with common letters, and there aren't many people in the world whose name starts with Z. So if you pick the wrong sharding strategy, you still end up having hot spots within your application. But if you hash it correctly, this is a nice way that you can actually take lots and lots of small databases and split up tons of data across them in a very clean way.
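Here's a minimal sketch of hash-based shard routing; the shard list, key choice, and hash function are illustrative assumptions rather than the talk's code.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShardMap
{
    // Hypothetical pool of small databases that rows get sprayed across.
    private static readonly string[] ShardConnectionStrings =
    {
        "Server=shard0...;", "Server=shard1...;",
        "Server=shard2...;", "Server=shard3...;",
    };

    // Hash the key instead of using its first letter, so common
    // letters don't turn into hot spots.
    public static string GetShardFor(string key)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(key));
            uint bucket = BitConverter.ToUInt32(hash, 0)
                          % (uint)ShardConnectionStrings.Length;
            return ShardConnectionStrings[bucket];
        }
    }
}

// Usage: var connStr = ShardMap.GetShardFor("Scott Guthrie");
```

Note that the shard count is baked into the routing function, which is exactly why re-partitioning a live app is so painful and why it pays to pick the scheme up front.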
Little bit trickier if you wanna query the database, and if you need to query all the data at the same time, you have to think about fan out queries and things like that. But again, if you think you're gonna end up having lots and lots of data, this provides a way that you can do it without having to get lots of big machines in order to run it. And then what a lot of people end up doing
is a combination of these two, where I might vertically partition and stick images and things in other places, and then use sharding for my rows as well. And this kind of, again, allows you now to basically, if you do it right and up front, you could potentially have petabytes of data being stored in the cloud without having to actually invest in any massive machine.
You can instead have lots and lots of small and medium machines that you're spraying it across. And again, you can grow it as your app grows, and you're not actually having to buy a lot of up-front infrastructure in order to do it. And the last thing I'll just mention about partitioning: it's much, much easier to do this up front
than when your app really gets big. I'd say on average, about once a month, we get panicked phone calls from customers whose apps are taking off in a really big way, and they're going, oh my god, I'm storing everything in a single data store, and I have about 45 days until I run out of space on it, can you help me? And if you've got a lot of custom business logic
that's built against that data store, and you have customers that are constantly using the app, there's no good time to just say, I'm going to go down for a day while we migrate. And so we end up going through a lot of Herculean efforts to help the customer re-partition their data on the fly with zero downtime. And it's very exciting and very scary as we do it. And so, designing this and thinking about this up front,
if you think your app's going to be the next Facebook is really important. But even if you think your enterprise app might just have more take-up in the first year than you thought, designing this up front will make your life a lot easier. So I'm going to walk through how to do this with something called Blob Storage, because I'm actually using this within my application. And I'm taking advantage of the service that we have
inside Azure we call the Blob Storage Service. It's, again, sort of a highly scalable, durable, think of it as a file system in the sky. And what it basically lets you do is store any type of binary object inside there. So it could be images, it could be movies, it could be anything. And what's nice about it is we do all of the storage and replication for you, so you don't have to worry about backing anything up or running any virtual machines. Instead, we give you a nice REST API, plus .NET, Ruby, and Java APIs, and you can effectively save and retrieve things from that Blob Store at will, and we'll handle it for you. You can also then optionally make any of the items you store in the Blob Store available to the public
so that anyone from a browser or from a mobile device can directly download it from the Blob Store as well. By default, everything's secure, but again, you can grant access, either blanket or you can even do temporary leases to individual users so they can get access to it. And this avoids you having to have someone hit your web server and the web server
having to hit the file system, grab the data, and stream it out. So instead of having that kind of gateway model, they can access it directly, which means you need a lot fewer web servers and it makes it much more efficient. This is some simple code. Again, you can go into more detail later on, you can look at it, but it just shows programmatically everything I need to do in order to create a Blob Store
and actually upload photos into it. And so this is a simple method called create and configure, which I'd run as part of app startup. Basically, all I do is I create what's called a Cloud Blob client object, and I'm going to say here I want to find a container called images. Think of this logically, maybe like a directory in some ways. And I'm just going to create it if it doesn't exist,
and if it doesn't exist, I'm going to go ahead and set the permissions to make it public so that browsers can access it directly. And so I can just run this script as part of application start inside ASP.NET, and now I have a Blob container in the cloud. And then what I can do is every time someone uploads a new fixit where they attach a photo,
what I'm basically doing in my MVC controller is calling this upload photo method, where I pass in the HttpPostedFileBase, which is just the ASP.NET object that encapsulates the uploaded file. I basically connect to that same images container, then create a dynamic file name on the fly; I'm just creating a new GUID, and I use the file name extension that was uploaded, so if it was a JPEG, I make it some random GUID.JPEG. And then basically this is the code right here to upload it into the cloud and set the MIME type on it. As you can see, the simple method UploadFromStream just takes a standard .NET IO stream, and then I just basically retrieve back the URL
to this uploaded file and return it back from this method. And now I can just store this inside my database. And so I can just reference this directly within my class. And so if you notice here my fixit tasks, what I'm storing in my database is that photo URL.
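Here's a sketch reconstructing the two methods described, CreateAndConfigure and UploadPhoto, against the Azure Storage .NET client; the class name and details follow the talk's description but are approximations of the actual sample code.

```csharp
using System;
using System.IO;
using System.Web;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public class PhotoService
{
    private CloudBlobContainer container;

    // Run once at app startup: create the "images" container if it
    // doesn't exist, and make its blobs publicly readable so browsers
    // can download them directly from blob storage.
    public void CreateAndConfigure(string storageConnectionString)
    {
        var account = CloudStorageAccount.Parse(storageConnectionString);
        var client = account.CreateCloudBlobClient();
        container = client.GetContainerReference("images");
        container.CreateIfNotExists();
        container.SetPermissions(new BlobContainerPermissions
        {
            PublicAccess = BlobContainerPublicAccessType.Blob
        });
    }

    // Called from the MVC controller when a fix-it with a photo is
    // posted: store the upload under a GUID-based name, set the MIME
    // type, and hand back the public URL to save in the database.
    public string UploadPhoto(HttpPostedFileBase photo)
    {
        if (photo == null || photo.ContentLength == 0) return null;

        string name = Guid.NewGuid() + Path.GetExtension(photo.FileName);
        CloudBlockBlob blob = container.GetBlockBlobReference(name);
        blob.Properties.ContentType = photo.ContentType;   // set the MIME type
        blob.UploadFromStream(photo.InputStream);          // takes a standard .NET Stream

        return blob.Uri.ToString();  // becomes PhotoUrl in the SQL row
    }
}
```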
So the only thing I'm storing in the database now is the URL to the blob that was just uploaded. If we go to the task controller, this is the create method that gets called when I post a new fixit and upload an image. And all I'm doing here is just calling this PhotoService.UploadPhoto, which is the code I just walked through, and you can see it right here. And again, all it basically does is talk to the blob store, get a reference to the images container, create a dynamic file name, upload the file into blob storage, and then get back a URL for it. And if I run this app locally, I'll quickly upload a photo,
and it'll say create a new fixit, do something, please, send it to me, and I'm just going to pick a photo here, pictures of my kids,
many folders, many photos, but not the ones I'm looking for. Somewhere here, there's something that says fixit photos. Fixit pictures, there it is, okay. Let's not pick my messy clothes.
Anyway, what this is doing right now is I basically called this upload photo method in my task controller, it got back the URL, it set it on the entity context, and I persisted that into the database. And under the covers, if I go ahead and connect, you'll notice this is a simple blob explorer tool that you can get. There's my images, and I probably saved it in a different account, but anyway, somewhere in here is the picture of one of the images I just uploaded. If I double click it, it's downloading it,
and that wasn't the one I just uploaded, but you get an idea here. I unfortunately don't know what the random GUID was, the one we just created, but this is being stored in the blob storage system. The database, again, just has the URL to it, and what's cool is if I go ahead and look at the fixits assigned to me, and this is the one we just created, and hit details,
this will actually display the photo, and all we did to do this inside our app is if you look in the dashboard details view, all I need to do is just, if there's a photo URL, I'll put an image tag with the URL, and if you look at the view source on this page,
one of the things you'll notice is here is where this URL actually lives, and notice it's actually not coming from my web server. It's instead coming directly from my blob storage account, so I didn't have to actually have my web server stream this picture out. Instead, I could just use my web app
to upload it to blob storage, get the URL back to it, save it in the database, and now when a browser hits it, they can hit the image directly. So I don't have to build out a huge farm of servers in order to dish out these images. I can just leverage the fact that blob storage does this for me automatically. And the beauty about blob storage, again, is you only pay for what you use, so right now I'm probably paying less than 12 cents per month
in order to store all these images, because I think roughly 9 cents per gigabyte is what we charge, and it's an average for the month, and then you pay a small per transaction fee. And the beauty about each blob storage account that you create is you can store up to 100 terabytes of content within it,
and you can have any number of storage accounts that you want to create. And so now this application, again, it's a very simple application, or small application, but it now actually has the bandwidth where if I wanted to, I could literally have a million customers storing fix-its in here, and I could actually scale to pretty much any number of fix-its
and images being stored within the application. And as you saw, it wasn't much code in order to enable it, and I can leverage the fact that the blob system provides a high available service for me to use. One other nice thing about the blob storage system is it supports a feature we call geo-replication, and so what that does is, and you can turn it off if you don't want it,
but it will basically say anything you store in the storage system in, say, North Europe will automatically be replicated to an account in West Europe as well. And so that way, if an air strike did happen for real and obliterated wherever our data center was, your data would actually be at least 500 miles away somewhere else, and so we might not be alive, but our data would still exist. And hopefully air strikes don't happen, but natural disasters do happen, flooding, hurricanes, et cetera, and so having this option to have the storage system automatically replicate your content in the background
is something that's really nice, and especially as your data grows, being able to replicate petabytes of data or gigabytes or terabytes of data ends up getting more and more complicated, and we take care of that for you. We also make sure that your data never leaves the EU, so if you deploy inside our North Europe data center, we'll make sure that we replicate only to the West Europe data center region.
We never replicate across continents or across geopolitical boundaries. So that was a little bit on data and data partitioning. Let's now talk a little bit about designing for failures. We joked earlier about what we would do in an air strike. One thing you want to think about as you build any real application,
but especially one in the cloud where lots of people are going to hit, is how do you basically design and architect it so that it can survive failures and be very robust? Given enough time, lots of different things are going to go wrong in any environment or any software system, and a lot depends on how your apps actually behave and handle them,
as to whether or not your users are upset or whether you're spending a lot of time in the middle of the night trying to figure out what went wrong and how to fix it. I'm going to talk about a bunch of techniques over the next couple slides that you can use both from a meta perspective and then from an architecture perspective to design for it. A couple of things to think about with failures is there's lots of different types of failures.
Machines will die, so individual servers will go down. Part of the goal with the cloud is you should be resilient to that, so if you have things running high availability and a machine goes down, it's okay. Another server spins up and takes its place in a well-architected cloud app. The platform as a service features that I've been walking through, like our websites and our SQL databases,
are designed to do that for you automatically, so you don't have to write code to handle that. You will also, though, see cases where whole services fail. So what if our database service fails that's hosting our database? How do I handle that? You want to start thinking about, as you use more services within your cloud apps, what happens when that service is having a problem and is unavailable?
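As a preview of the transient fault handling pattern, one concrete mitigation for flaky service calls: EF6 can retry operations that fail with known transient SQL Database errors if you register an execution strategy. A minimal sketch, assuming an EF6 app; the class name is arbitrary, and EF discovers DbConfiguration subclasses automatically.

```csharp
using System.Data.Entity;
using System.Data.Entity.SqlServer;

// Registering SqlAzureExecutionStrategy makes EF6 automatically retry
// operations that fail with transient SQL Database error codes
// (dropped connections, throttling, and so on).
public class FixItDbConfiguration : DbConfiguration
{
    public FixItDbConfiguration()
    {
        SetExecutionStrategy("System.Data.SqlClient",
                             () => new SqlAzureExecutionStrategy());
    }
}
```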
And we'll talk more about that in a second. And then occasionally you will have whole regions fail. So in other words, there could be a natural disaster in the North Europe region or in the West Europe region or the East US region. If your app is running in it, what do you do? And so it's also possible, and I'm not going to go into really covering it in this talk because it's almost an entire separate set of talks,
which is the ability to run your app in multiple regions simultaneously so that even if there was a disaster in one region, you could have zero downtime. Most enterprise apps, and in fact many consumer-facing apps, unless they're a certain size, don't actually do that. But it is possible to architect in that way,
and it's a good best practice if you can. Part of our goal with Azure is to make it easy to handle all these failure scenarios, including region-wide failures, a lot easier. People often hear about SLAs, service level agreements, and basically these are kind of promises that companies make
when you're using one of their services about how much downtime that they'll have. So they'll guarantee you might hear 99.9% uptime. One thing that's important to understand is, what does a 99.9% SLA actually mean? And one of the things that's also worth asking when someone tells you they have a 99.9 SLA, what's the time frame?
Is it per month or per year or per week? And what exactly is the definition of being down? These are some examples of what SLAs ranging from one nine up to six or seven nines mean from a downtime perspective per year, per month, and per week. 99.9 means that technically that service can be down 8.76 hours per year, 43 minutes per month, and 10 minutes per week. That's more downtime than most people realize. And so again, as a developer, if you're using a service that says, hey, I might be down for 10 minutes a week, you really want to make sure you design your app to handle that in a graceful way,
because at some point, one of your customers is going to be on the app and the service that you depend on is probably going to be down, and you want to be able to react to it. We provide a bunch of SLAs inside Azure. These are monthly SLAs, so we reset the clock every month; we don't just count them up per year. We always aspire to do better than the SLA. So don't read this as, oh, Azure says we're going to be down 10 minutes a month for your website. No, what we're trying to say is, if we are ever down below that, we actually give you money back, or you can ask for money back. And so there's sort of an enforcement policy as part of that. Usually the money we give you back never compensates you
for the business impact of the app being down, so that's not an excuse, but it at least lets you know that there is sort of enforcement in this, and we do take it very seriously. One important thing when you think about SLAs is how do they add up and they compose? So a simple example to make this real,
we're using a website, a blob storage system, and our SQL database right now. Each of these has an SLA. What is our app SLA now when we actually use all three of these services? Anyone guess? The smallest one? Oh, someone's pretty good at math.
It's actually not the smallest one. A lot of people assume it's the least of the three, but that would only be true if every time there was a failure, they all decided to synchronize their failures at the same time. And that rarely happens in the real world, and so what you actually need to think about is, if they all fail at different times during that month, your actual composite SLA would be about 99.75%, which is about an hour and a half of downtime per month. Anyway, it gives you a sense of, wow, that's more downtime than I'd actually want to have to take. Again, the aspiration is 100% on reliability,
but this basically just means as you're building a cloud app, I need to think in mind that I might have some of these things having problems. How do I handle them as a developer? The good news is there's techniques you can use to protect yourself against that, and that's what we're going to spend the rest of the time talking about. Again, these SLAs, Azure actually provides, I think, the best cloud SLA of any cloud provider out there,
so don't take this as, wow, Azure seems flaky. This is actually the best you can get in terms of the SLA that we're offering compared to, say, Amazon or Google. This is more a statement of, as a developer, wow, I really need to be thinking about how my app handles failures because this might happen more often than you think. Another thing people often say is, wow, in my enterprise app,
we never have these problems. And I usually say, well, how much downtime a month do you have? And they say, well, you know, it happens occasionally. And I go, okay, well, how occasionally? Well, only if we need to back up or we're installing a new server or taking downtime. And I say, okay, that counts as downtime. So typically, most enterprise solutions, unless they're mission critical, actually are often down for more than this.
But when it's your server and you're managing it, emotionally you feel much less angst when it goes down because it's your problem. In a cloud environment, you often find psychologically, you feel like, well, I'm using someone else's service and it's down. I don't know what's going on, and it can be a little bit scarier.
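To make the composition arithmetic concrete, assume monthly SLAs of 99.95% for the website and 99.9% each for blob storage and SQL Database (illustrative figures):

$$0.9995 \times 0.999 \times 0.999 \approx 0.9975$$
$$(1 - 0.9975) \times 30 \times 24\ \text{h} \approx 1.8\ \text{hours of permitted downtime per month},$$

which is the roughly hour-and-a-half-per-month figure mentioned above.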
So a couple patterns you can use, and we're going to walk through all these in terms of kind of surviving these types of things, is having good telemetry monitoring, handling transient faults, enabling loose coupling, and using caching. And then this last one I'm going to call circuit breakers, which is making sure you take all these and when you see there's a problem, how do you instrument your app so that you automatically adapt to it without you having to be paged
and actually do some manual work in order to make it work. So let's talk about monitoring and telemetry. How many people here have really good monitoring infrastructure in the apps that they deploy on-premises? One person. Great. Or three. Okay, good. How many people rely on their customers, whether internal or external, to tell them when their app is down? Okay, there's probably more, but you're all cowards and don't want to raise your hand. But yeah, a lot of people rely on, hey, send me an email when it's down. That's not really a good best practice anywhere, but especially in the cloud it's not a good best practice. One of the things I'm going to try to do is walk you through
how easy it is now in the cloud to actually have a really great monitoring solution that you can use. Do they have Frogger here? I don't know the translation. I actually stole this slide from someone else, but the way to think about telemetry and monitoring is: if you're playing the game Frogger and you can see everything that's moving around, you have a much better chance of surviving. If you don't have any telemetry, you're just sort of in the dark, and when someone says they had a problem, you ask, what was the problem? And they say, I couldn't hit it, or it showed an error. What was the error? I don't remember. Okay, then how do you find in a giant code base what the problem was, and how do you figure it out?
With a good telemetry system, you know at least what car ran into you so that you can prepare for it the next time, and there's a couple different ways to do it. One kind of approach, and I'll walk through how you can do custom logging in the next pattern, but one thing I actually always try to tell people that's nice about the cloud versus, say, an on-premises environment
is it's really easy to either buy or rent your way to victory. You don't actually have to write a bunch of code in order to get a pretty good monitoring telemetry system out of the box. There's a bunch of great partners that integrate with Windows Azure. These are just some of them that enable you to, with very little effort, actually get a really good telemetry system
up and running really cost-effectively. Many of these guys have free tiers, so you can actually get basic telemetry for nothing, forever, within your application. Scott Hanselman has a cool blog post, which you'll find linked in the slides, that you can read to learn more about how to set this up.
I'm just going to walk you through really quickly how to use one of them, New Relic, to enable this. What I'm going to do here is simply log into our app just to get a little bit of telemetry data in the system, and I'm just going to say, hey, let's go ahead and add another photo and assign it to me,
and we'll just upload something. Ah, we can still see the Fixit pictures. There we go. More messy clothes. It's there. I can see what's assigned to me. We can edit it. We can see details. Anyway, I'm just hitting the site a couple of times to warm it up
and let some of the telemetry data flow. So if I want to then see as an administrator what's going on in the system, I can take advantage of something like New Relic in order to do it. All I need to do is I go into the portal
and I'm just going to refresh the dashboard. What I can do with New Relic, assuming my internet connection is still available, is very easily go and use what's called the Windows Azure Store to sign up for this New Relic service. Then all I need to do inside my app is install a NuGet package for New Relic and register a license key, and my telemetry will actually start flowing from my application and be available for me to use. So to sign up, I just go to New, then Store, and we have a way that you can actually
consume services from third parties. New Relic is one of those. So just go ahead and click New. There's a free tier and then you can actually pay by the number of servers that you actually use, but the free tier is free forever. So if you go ahead and click on that, it'll show up in the portal here. I have a nice New Relic service. Basically, all you need to do is grab the connection string.
Don't look at that since it's mine. Paste it into your app and you just run a NuGet package to install the New Relic agent into your web app, deploy it into the cloud, and then as soon as you do that and start running on it, you just go ahead back to the portal inside Azure and click Manage. And we'll single sign on you into the New Relic management portal.
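If you're wondering what that NuGet step looks like, it's essentially a one-liner from the Package Manager Console. The package name below is the one that was used for Azure Web Sites around the time of this talk, so check New Relic's current docs for the exact name and for where the license key goes:

    PM> Install-Package NewRelic.Azure.WebSites
    PM> # then add your New Relic license key to the site's settings,
    PM> # per the instructions the package drops into your project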
And what you can see here is my Fixit app in Europe that we just hit a few moments ago is now uploading telemetry to New Relic and I can actually start to see how many requests am I doing. You can see some of them showing up here. I can see how much throughput I'm doing on the server and I can actually see how much time in milliseconds
each of the individual action methods within my app are taking. And if I drill in to individual ones, I can even go ahead and see what am I actually doing, what are the methods that are being called. It's using the .NET profiler, so I can actually even see time inside individual methods, where are the hot spots within it, what's the historical performance.
If I want to look at external services, for example databases or blob storage, I can pull this up, and you can see here I'm using this blob storage service. Right now I've only hit it a few times, so it unfortunately doesn't have a ton of data, but it'll actually show you how many calls my app made
and what the availability looks like of that external service. And I can set up various reports or events, so I can say: any time it starts dropping or throwing errors, send me an SMS or an email so that I actually know I have a problem. Again, the beauty of this is it's very quick, and I can even see where people are accessing this app from. And so you can see here the different parts of my app: I've got my web app and where people are hitting it from, with the latency, and here's the average latency between my blob and my table storage, which we'll talk about in a little bit. Anyway, you get all this effectively without having to write any code. And so the beauty here is: use the cloud, install the New Relic agent
or one of the other ones I mentioned, and without having to do really anything, you suddenly get a lot more insight into how your solution is being used and what the customers actually using it are seeing. And again, if you set up the notifications, you'll actually start getting told when there are problems.
One thing you want to also then think about is what are you going to use all this information for? So the nice thing about one of these things I've shown you is it at least gives you a high-level overview of what's going on in your solution. And so you can at least know when there's a problem and you can know what the customers are seeing. But when you want to then drill in and say, okay, what was the real code reason
or what was the real architectural reason, you typically want to instrument your code to get even more insights into that. And that's what we're going to walk through in the next little bits here. Generally, one good best practice I highly recommend is every time your app talks to any service, whether it's a database, whether it's blob storage,
whether it's a REST API call that's done by another provider, is make sure you're monitoring and logging it. And so measure not only did it succeed, but also how long did it take? Because often what you see in a services-based world is it's not that the thing isn't available, it's just the thing that used to take 10 milliseconds occasionally takes a second to return.
And if your app assumes it always takes 10 milliseconds, you start to run into problems. When people say, my app is slow, you want to be able to very quickly look at something like New Relic and say, oh, you're right, I saw some spikes, and then be able to use your logging infrastructure to dive even deeper to understand exactly what call inside the service actually caused the problem.
So one way you can do this is by instrumenting your code with logging. And so one of the things you might have noticed when I showed some of the code a little bit earlier is I had these stopwatches and these different logging things throughout here that I was using and instrumenting within my code. And basically, what I recommend doing
is, when you create a production project, create a simple ILogger interface and stick some methods in it. This gives you the ability to change your logging implementation later however you want, without having to rip it out throughout your code. You could use System.Diagnostics inside your .NET app directly. What I'm doing here is creating the ILogger interface, and then my logger implementation is just using System.Diagnostics behind the covers. But that way, if I ever want to make it richer, I can replace System.Diagnostics with any other custom logic I want. And then every time I make a call, or every time I have an exception, I log the exception details to that logger interface, with the exception and the inner exception included. And then every time that, say, I call my fix-it task repository, I just make sure I kick off a stopwatch before I call that method, take the time afterwards, and write it to the log. And so you can see here, in just a few lines of code, every time I make a call to a database, to blob storage, or to a REST API,
just log the time and whether it was successful, and if it failed, log the failure details. Having this in your logs means that when someone says the app felt slow, you can dump all this information and figure out exactly what the problem was. You can also look at historical trends and understand whether the service is usually fast and then dips at certain times or has issues at certain times.
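To make that concrete, here's a minimal sketch in C# of what such an interface and a System.Diagnostics-backed implementation can look like. The method names are my approximation of the Fixit sample's, not necessarily its exact shape:

    using System;
    using System.Diagnostics;

    // A tiny logging abstraction: code depends on this interface, so the
    // implementation can be swapped later without ripping logging calls
    // out of the whole code base.
    public interface ILogger
    {
        void Information(string message, params object[] args);
        void Warning(string message, params object[] args);
        void Error(Exception exception, string message);
        void TraceApi(string componentName, string method, TimeSpan elapsed);
    }

    // Default implementation just delegates to System.Diagnostics.Trace.
    public class Logger : ILogger
    {
        public void Information(string message, params object[] args)
        {
            Trace.TraceInformation(message, args);
        }

        public void Warning(string message, params object[] args)
        {
            Trace.TraceWarning(message, args);
        }

        public void Error(Exception exception, string message)
        {
            // Log the message plus the full exception, which carries
            // any inner exception details along with it.
            Trace.TraceError("{0}: {1}", message, exception);
        }

        public void TraceApi(string componentName, string method, TimeSpan elapsed)
        {
            // One info-level entry per external call: what we called,
            // from where, and how long it took.
            Trace.TraceInformation("Component: {0}; Method: {1}; Elapsed: {2} ms",
                componentName, method, elapsed.TotalMilliseconds);
        }
    }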
And having this data is going to let you architect your solution to be much better. If you're familiar with System.Diagnostics, it has a notion of tracing levels that you can use. So you can say: only log errors, or warnings, or info, and you can control when it's turned on with a config switch. What a lot of people do with client apps, and even with ASP.NET apps, is only turn on tracing when they know there's a problem and they want to debug. One thing I'd recommend in the cloud world, since we have storage accounts that can store hundreds of terabytes of data, is to actually run your solution with the error, warning, and probably information levels always on. So you're always logging the information.
Every time you make an API call, even if it was successful, write to the info log, write an info trace with how much time it took. And that way you can have historical data. You can always look back at it. And if you want to, you can delete it eventually or you can free it up. But this gives you the flexibility where you're always able to understand
when someone says, I had this random problem, it was roughly around 8 o'clock last night, I don't remember the error. Instead of just saying, great, you can actually pull up your logs and look and say, oh, it was because the SQL database was having a problem. It took four seconds to do that query instead of the usual 10 milliseconds. I wonder what that was about.
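By the way, if you haven't used trace levels before, here's a small sketch of how that config switch works with System.Diagnostics (the switch name LogLevel is something I made up for the example):

    using System.Diagnostics;

    // In web.config you'd define the switch, e.g.:
    //   <system.diagnostics>
    //     <switches>
    //       <!-- 0=Off, 1=Error, 2=Warning, 3=Info, 4=Verbose -->
    //       <add name="LogLevel" value="3" />
    //     </switches>
    //   </system.diagnostics>

    public static class AppTracing
    {
        // Reads its level from config, so you can dial logging up or
        // down without redeploying any code.
        public static readonly TraceSwitch LogLevel =
            new TraceSwitch("LogLevel", "Application trace level");
    }

    // Usage: guard chatty info-level logging behind the switch.
    // if (AppTracing.LogLevel.TraceInfo)
    // {
    //     Trace.TraceInformation("SQL query took {0} ms", elapsedMs);
    // }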
And the cool thing then is that with .NET and with Azure, we actually have some built-in logging systems that you can use. For websites, again, you can just use the standard System.Diagnostics API, and likewise for cloud services. This will write to table storage. All of these also let you capture HTTP logs and other things as well. And storage analytics has a bunch of cool built-in capabilities too.
So definitely turn these on. They're not on by default. Make sure you turn them on, because you'll thank me later when you're actually trying to debug a problem. So let's look really quickly at what this looks like. What I basically did to implement this is I have a logging project with a logger interface containing a couple of simple methods: I've got information, error, and trace APIs.
And right now, my implementation is really simple: I basically just call into the standard System.Diagnostics trace API. So you might ask, why not just call that directly? Well, the beauty is that if I ever want to add more logic inside my code, I now at least have an extensibility point where I can do it, and I don't have to rip out all my other logging calls. And then, to wire this up inside my service, I'm just using the dependency injection capability inside ASP.NET with Autofac, and I just say: everyone that expects an ILogger interface gets this logging implementation passed in. And so now if you look at, say for example, my FixItTask repository class, it takes the logger as a constructor argument, and now every time it makes a database call, I say Stopwatch.StartNew, do the call, stop it, and then I log it: I note the method, the dependency I'm accessing, and how much time it took.
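Pieced together, the wiring plus an instrumented repository call look something like this sketch (FixItContext and the method shown are my stand-ins for the sample's actual types; the Autofac calls are the standard ASP.NET MVC integration):

    using System.Diagnostics;
    using System.Web.Mvc;
    using Autofac;
    using Autofac.Integration.Mvc;

    public static class LoggingConfig
    {
        // Called once at app startup: anyone that asks for an ILogger
        // gets our System.Diagnostics-backed Logger.
        public static void Register()
        {
            var builder = new ContainerBuilder();
            builder.RegisterControllers(typeof(LoggingConfig).Assembly);
            builder.RegisterType<Logger>().As<ILogger>().SingleInstance();
            DependencyResolver.SetResolver(new AutofacDependencyResolver(builder.Build()));
        }
    }

    public class FixItTaskRepository
    {
        private readonly ILogger log;
        private readonly FixItContext db = new FixItContext(); // hypothetical EF context

        public FixItTaskRepository(ILogger logger)
        {
            log = logger;
        }

        public FixItTask FindTaskById(int id)
        {
            var timer = Stopwatch.StartNew();
            var task = db.FixItTasks.Find(id);   // the actual database call
            timer.Stop();

            // Record which dependency was hit, from which method,
            // and how long it took.
            log.TraceApi("SQL Database", "FixItTaskRepository.FindTaskById", timer.Elapsed);
            return task;
        }
    }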
And now, since I ran through the app earlier, if I go and look at my Server Explorer, one of the things I can do is look in my tables and pull up my log. And what you can see here,
that was actually last night, in case you were wondering why that failed. I was wondering why that failed as well. It turned out I had a goof in one of my scripts. But you can basically see here, here are all the calls that I've been making throughout these demos today. And each of these times where I'm actually accessing and calling a service, so for example, I'm calling my SQL database, you can actually see there's the call,
there's the method that called it, and this is exactly how much time it took in production when that call happened for the SQL database to return. So it took about 69 milliseconds to retrieve that call. And again, if you go spelunking through here, you'll find that there's cases where, hey, it drops, or it's variable.
So this time it took 122 milliseconds. Okay, that's interesting data. And so now I'm logging all this, and I can figure it out. I've also got my blob storage logged every time I upload a new file: notice here's the photo service, where I call out to the blob service, and again I can look concretely at how long it took. It took 211 milliseconds to upload that file from my app. So all this stuff is logged. Once you have the logging implementation in place, it's just a couple of extra lines of code every time you call a service. But now, every time something goes wrong in my app, every time someone says they ran into an issue, I know exactly what the issue was. Whether it was an error, or even if it just ran slow, I can now pinpoint exactly which thing I depended on
was actually the root cause. So it's a simple model, simple to turn on. All you need to do to enable it on a website is go into the portal (and again, you can automate this through a script), or, as I've done here, just click on the Configure tab in the portal if you want to do it manually, and you'll notice down here there's something called Application Diagnostics. You can write the logs into the file system or to a storage account. Just turn it on, choose what level of information you want to log, then under Manage Connections basically pick a storage account and paste in its connection string, and we'll start dumping the logs there for you automatically. It takes about five seconds for them to show up. So it's a really simple way you can do logging, and I guarantee you when you have production issues, a combination of a telemetry system and a custom log will really help you. The next pattern we're going to talk about is called transient fault handling.
This is another one that's really important to think about in the cloud, and it's somewhat unique to the cloud. Logging in general you should do in every app; this one you should probably also do in every app, but it's something you especially need to think about in a cloud environment. Often, when you're relying on a service, you're going to have little glitches that happen. They're not outages, but the database connection might drop periodically, because you're going through more load balancers than when your web server and your database server are sitting together in the same server room on premises. So you'll see more connection terminations than normally happen. Sometimes, if it's a multi-tenant service, you'll see cases where some customer starts pounding on the service and your performance starts dropping, so things might start getting slower or timing out more. Or the service might start throttling you, saying, hey, you're hitting me too frequently, back off, because you're kind of denial-of-servicing me. All these things are regular occurrences in a cloud environment. One of the things you want to do is have what we call smart retry/back-off logic
so that when this happens, instead of your app just throwing an error, you can handle it in a more graceful way: you retry, and often you can recover without ever having to throw an error to your customer. And there are a couple of different approaches you can use. There's a cool library from patterns and practices where, if you're writing, say, raw ADO.NET against a SQL database, you can use this retry application block. You basically just say, hey, I'm connecting to a SQL database, give me a reliable SQL connection, and then I can specify, if a connection drops, how many times to actually try to reconnect,
and how long to wait between reconnect attempts. So this is going to say: retry three times, and make sure the whole thing takes less than five seconds end to end; otherwise, just give up and throw an exception. This way, if you have a temporary connection drop, your app can recover without unwinding all the way and throwing an error to your customer.
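Here's a rough sketch of what that looks like in code with the patterns and practices Transient Fault Handling Application Block (type and package names shifted a bit across its versions, so treat the exact names as approximate):

    using System;
    using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling;

    public static class ReliableSqlExample
    {
        public static int CountTasks(string connectionString)
        {
            // Retry up to 3 times, pausing 1 second between attempts,
            // so the worst case stays within a small, bounded budget.
            var strategy = new FixedInterval(3, TimeSpan.FromSeconds(1));
            var policy = new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(strategy);

            using (var connection = new ReliableSqlConnection(connectionString, policy))
            {
                connection.Open();  // transient failures on open are retried for you

                var command = connection.CreateCommand();
                command.CommandText = "SELECT COUNT(*) FROM FixItTasks";

                // The execute call is wrapped in the same retry policy.
                return connection.ExecuteCommand<int>(command);
            }
        }
    }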
This works with raw ADO.NET. One of the things we're doing in Entity Framework 6, which is also the version that adds async support, is baking this directly into the Entity Framework, because typically with the Entity Framework, or any object-relational mapper, you're not working directly with connections.
So in the new EF6, you can basically, just as part of your configuration class, say, hey, I'm going to add a SQL execution strategy: again, retry up to three times, take up to five seconds, and that way if the connection drops, the Entity Framework will handle it for you and your app doesn't actually have to write any custom logic. And then, if it drops more than three times in five seconds,
it'll finally throw the exception and say there's a problem. This works against SQL out of the box, and the storage client library for Azure has the same type of retry logic built in as well, so you don't actually have to write any code to enable it there. And all I've done to use this inside my app here
is basically just add an EF configuration class and turn it on. That's all you need to do, and now you're going to be much more resilient to database connection drops and will throw a lot fewer errors.
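For reference, that EF6 configuration class amounts to just a few lines. A sketch (the class name is mine; EF6 automatically discovers any DbConfiguration-derived class in the same assembly as your context):

    using System;
    using System.Data.Entity;
    using System.Data.Entity.SqlServer;

    public class FixItDbConfiguration : DbConfiguration
    {
        public FixItDbConfiguration()
        {
            // Retry failed SQL operations up to 3 times, within roughly
            // 5 seconds, before letting the exception bubble up.
            SetExecutionStrategy("System.Data.SqlClient",
                () => new SqlAzureExecutionStrategy(3, TimeSpan.FromSeconds(5)));
        }
    }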
One thing I'll just mention: any time you have an error and you retry, be aware that if lots of people are having errors and everyone is retrying, you can basically denial-of-service the service itself, for example if there are millions of people all trying to hit it and retry at once. You want to be careful of that. The other thing you want to be careful of in your app
is that if you keep retrying and retrying forever, you can just back up your customers so the web servers get slow, because everyone is sitting in a queue waiting to hit the database. So be careful about the max retry threshold you set. Generally, I wouldn't keep retrying a database call on a web request for more than about five seconds,
because otherwise your other customers are just going to get put in the queue. Sometimes it's better to just bail out and throw a friendly error to the customer saying, sorry, try again, as opposed to just continually backing up forever. But definitely add this type of logic into your app. It'll make it much more resilient to failures. One other quick pattern I'm going to talk about here,
and this is probably the last pattern we'll do before we break, is something called the queue-centric work pattern. This is sort of a different approach than a lot of people who have built, say, traditional web apps have necessarily thought about in the past. It's basically a pattern that you can use in a variety of places, for web apps, mobile apps, and other apps, and it effectively adds an intermediary layer
between your web tier and your back-end service. This is useful for a variety of different things.
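As a quick preview of the mechanics before we break, here's roughly what that intermediary layer looks like with Azure storage queues, using the storage client library of this era (the queue name and message payload are made up for the sketch):

    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class FixItQueue
    {
        // Web tier: instead of calling the back-end service directly,
        // drop a message describing the work onto a queue and return.
        public static void Enqueue(string storageConnectionString)
        {
            var account = CloudStorageAccount.Parse(storageConnectionString);
            var queue = account.CreateCloudQueueClient().GetQueueReference("fixit-tasks");
            queue.CreateIfNotExists();
            queue.AddMessage(new CloudQueueMessage("{ \"fixItTaskId\": 42 }"));
        }

        // Back-end worker: pulls work off the queue at its own pace, so a
        // slow or unavailable back end never blocks the web tier.
        public static void ProcessNext(CloudQueue queue)
        {
            var message = queue.GetMessage();
            if (message != null)
            {
                // ... process the fix-it task identified by the message ...
                queue.DeleteMessage(message);  // delete only after success
            }
        }
    }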