An Actor’s Life for Me
Formal Metadata

Title: An Actor's Life for Me
Number of Parts: 170
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers: 10.5446/50614 (DOI)
Transcript: English (auto-generated)
00:02
Hopefully, that was a bit more interesting than normal slides. This is .NET. Was anyone in Joe Armstrong's talk? You missed a blinder, the rest of you, I tell you. He made a joke about .NET programmers, and now it's a bit weird,
00:21
being in a room with lots of .NET programmers. This is .NET, and we're going to be talking about the actor pattern, and something called TPL Dataflow, which is a NuGet package available for C#, F#, and VB.NET. That's the talk we're going to do: how to make your programs concurrent if you're working in .NET.
00:42
If that sounds like the kind of thing you want to be at, you're in the right room. If it's not, feel free to go elsewhere. No leaving halfway through, otherwise I have to throw something at you as you walk past. Let's get going. I'm not looking at tweets. I'm using this to control the PowerPoint.
01:03
If I'm lucky, I don't drop the phone, which I did at one of the talks, and had to put it back together, which is interesting. All the code and slides are already on GitHub. If you go to GitHub, WesleyL, and it's an L, not a 1, you'll be able to find all the slides, all the source code, and go up there.
01:25
Given that most of you don't know who I am, who am I? Was anyone at my await/async talk yesterday? You've seen this slide. I work at Huddle as an application architect; it's a startup that does document collaboration.
01:42
I cycle about 6,500K a year to get to work. When I'm not at work, I'm mucking about with my kids, and that's my dad at a farm when I went on holiday at Easter. So that's me. I'm a C# MVP, and my job is to help create the architecture at Huddle, make sure we're doing the right thing, right patterns and practices.
02:02
We have a wide range of tools. It's not just .NET. We've got RabbitMQ. We've got GetEventStore. We've got SQL Server. We're using Lucene for search. We're quite agnostic as long as it does the job. This is why I asked about Joe Armstrong. In his talk, and the one yesterday,
02:23
he was talking about the fact that we have these systems that are massively parallel now. We have CPU architectures that are multi-core, multi-threaded. The question is, firstly, how does someone like Microsoft write tools that we can use that take care of this, and then how do we write our code using those tools
02:42
so that we can take advantage of this speed? One of the guys at Microsoft said, there used to be a free lunch in programming. You used to be able to write your code, and within a few years, the processors got faster, and they got faster. You could just write your code a bit more verbose and complicated,
03:00
but it still went faster than the code you wrote five years ago. Then at some point, Intel decided that unless they were going to ship a fridge with every PC, they were going to lower that clock rate and instead put more processors in. Now we have more processors, but less speed. How do you use that?
03:21
The reason you've been watching a little bit of history repeating is because this isn't new. Marie Antoinette apparently said this, before "let them eat cake" and getting her head chopped off, or whatever happened. There's nothing new except what has been forgotten. When I was looking at TPL Dataflow and thought, great, this is a good thing to look at,
03:43
and it's based on the actor pattern, and you go back and look at the actor pattern, and you look for white papers on it, you're going back 41 years to when it first started to be discussed. This is a little bit of history repeating. Carl Hewitt wrote a paper in 1973 discussing the original actor pattern.
04:05
He wrote it with some other guys. He then wrote another paper in 1977. This was a technical discussion of how you can create systems that have lots of processors,
04:21
each with their ability to process locally. By avoiding certain pitfalls such as shared state and using a message-based architecture, we can actually get highly concurrent, scalable systems. That's really cool. The interesting bit was they weren't on multi-core, multi-threaded CPUs.
04:43
They had big rooms of hardware. But we now have those multi-core, multi-threaded machines. We're there. One of the issues with concurrent systems is state. That's going to be one of our biggest areas.
05:02
If you share state between processes, you're going to have a bit of a problem. I've got a physical demo here. I'm not sure it'll work. I'm going to have to throw it. I'll try not to injure someone. One half of the room. The aim is you write your favorite color.
05:22
I haven't got a pen, have I? You'd write it down. This side of the room, you write down the favorite color, and then you pass the pad to the next person. This side of the room, you rip a sheet off, hand the pad to the next person, and then you write down your favorite color. That's it. That's a simple demonstration of shared state.
05:45
Over here, we have shared state. There is only one pad. Every color is going to be written on that pad. On this side of the room, everyone has their own state. They all have their own sheet of paper, and the pad just gets passed to the back.
06:03
If these guys are fast, which they're quite slow at the moment, they will start passing that pad back faster than this side because this side is having to wait. The issue we'll have with this side, without that shared state, we have to collate the results at the end.
06:21
We're going to have to gather all those bits of paper, put them all together, and we may not get them in the order we put them in, but we get all the colors. If we were just doing a vote and doing a pie chart, we're fine. We'd be able to do that. Here, we have shared state. We could actually see the fact that if everyone's written red, someone might change their mind and write red
06:40
because they've looked at it and it said red, and they think, yeah, red. Or someone's put tangerine, and I didn't think of tangerine. That's really good. Obviously, I don't really care about what colors you're writing down, but that's a demonstration of how quickly you can do things. Parallel processing versus single processing.
07:00
What's the solution to this? Let's approach it like Carl Hewitt wrote up in the actor model. We now have a partial order of events. We have a series of things that happen, maybe in order, maybe in a known order. They may happen slightly out of order.
07:22
If you've heard of eventual consistency in CQRS-type architectures, you'd be happy with the idea that things come out of order, and eventually you reconcile it and it gets to the right place. Your bank does this all the time. If you go to online banking, you might have 200 euros in the morning,
07:41
and by the end of the day, it's gone down to 10 euros. If you're drinking in Norway, it's now minus 200 euros because you've bought two drinks. The point is, if you actually look at that, at the end of the day, you might have a big, big debit. Someone's taken a lot of money out at the start of the day, and then a bit of money came in later.
08:01
Actually, when you look at the statement, it tends to reorder it, and you don't see it in quite the same way. If you go at the end of the day, suddenly your statement items are in a different order because it really didn't matter. It's consistent by the end of the day, and that's what matters if they're going to charge you interest. We can handle that. We don't have this sequence of states.
08:21
We don't have a global state in the entire system. They're not all using the same bit of information. If you have a finite number of actors, things that process information, and there's a finite number of links between them, and there are no messages between them,
08:40
we can start using mathematics to prove how testable, how scalable it is, and you actually find you get a reliable, scalable, testable system by doing this. If you have state, we have to start analyzing state models. Joe Armstrong was happily saying that
09:02
if you take every atom on the planet, it takes six 33-bit integers to represent the number of states of every atom on our planet. That's how fast it gets complex. Obviously, what he didn't mention is every atom has protons, neutrons, electrons.
09:20
Those electrons are in different energy states, and there's an unbelievable amount of state in an atom. But the problem we have is the state explosion. Someone did a very good example, Yuting Chen, of a state explosion, to try and illustrate this issue. I've clearly ripped off this slide,
09:41
attributed it, and pointed you at the slide that he wrote. He did talking and itching, so someone talking to someone and then getting an itch and having to scratch it at the same time. Because I'm English, I changed talking to making a cup of tea because we'd rather make a cup of tea than talk to someone in England.
10:00
If we're going to make a cup of tea, how do we do that? First of all, we're going to boil the kettle, an essential item that, unfortunately, my hotel has decided to neglect to put in my room, but I brought one, so it's fine. We pour the water into a mug.
10:20
I'm assuming you've put a teabag in there before you poured the water into the cup or the teapot. Then you add the milk. You never add milk before the hot water. This is just insane and wrong. If you do that, you're not drinking tea. You're drinking something you'd buy in the U.S.
10:41
Right. The other thing we might do is we might get a scratch and we want to itch it, or we get an itch that we want to scratch. That's easy. I didn't have an itch, and then I did, and I scratched it. That's the only states we have. The problem is, what happens when I start to make a cup of tea
11:00
and I get an itch? Because we had it simple. We had four states and two states. That was really easy. Imagine now, we're starting off, and this little bracket underneath represents the two states of making a cup of tea and a scratch. The second number being the state of making the cup of tea.
11:22
We're going to boil the kettle. Let's take it over again. Oh, it's reset the slide. How annoying. We're going to boil the kettle. We're going to pour the water, and we're going to add milk. That's pretty easy. That's what we had before. We had this, we were making a cup of tea earlier, and that's exactly what we had. The problem is, we got an itch.
11:41
We got an itch before we did anything about tea, so that gives us a seventh state. Over here. We might have actually had the itch while the kettle was just boiled. Obviously, we've now itched, and we've got a boiled kettle, but we haven't put anything in the water. It might have happened just as we were doing the mug or when we added the milk.
12:00
Now we've got eight states to deal with combined. If you have a program that only has eight states, well done. But I guess you have more than eight states in your software. This gets worse and worse and worse. If you start sharing all that state, you get unpredictable code.
12:21
You don't know how things really work, and it's harder to test. The solution is this actor model. We're going to avoid shared memory and shared state. Instead, we're going to have messages or tokens passing between things that do work. They're actors. If you were an F# programmer, you'd talk about agents.
12:43
People have different words for the same thing. We process inputs and provide outputs. Someone gives me something, I do something to it, and I give something back. We don't have lock statements and unlock statements. Anyone like playing with lock statements?
13:03
Anyone got deadlocks? Yeah. Thread A is locked against the data that thread B wants, but thread B is locked on a different bit of data, and now they're dead. It's like a traffic jam in the middle of Rio. We have horrible things like locks. We have thread state.
13:20
I don't really want to share variables. I'll shove it on the thread. There's this random bag that I can put data into. We don't want to have that as well. We are saying we are going to communicate between our processes purely with little messages. Those messages are isolated. Those actors only understand what comes in, do something, and then spit it back out.
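What the talk calls an actor — a private bit of state plus a mailbox of incoming messages — can be sketched in C# with TPL Dataflow's `ActionBlock`, which is essentially a message queue with a processing loop attached. The `ColourCounter` class and its messages here are illustrative, not code from the talk:

```csharp
using System;
using System.Threading.Tasks.Dataflow;

class ColourCounter
{
    // State is private to the actor; no other thread ever touches it.
    private int _count;
    public int Count => _count;

    // The mailbox: the only way to interact with this actor.
    public readonly ActionBlock<string> Mailbox;

    public ColourCounter()
    {
        // ActionBlock processes one message at a time by default,
        // so _count needs no locks.
        Mailbox = new ActionBlock<string>(colour =>
        {
            _count++;
            Console.WriteLine($"{colour} is colour number {_count}");
        });
    }
}

class Program
{
    static void Main()
    {
        var actor = new ColourCounter();
        actor.Mailbox.Post("red");
        actor.Mailbox.Post("tangerine");
        actor.Mailbox.Complete();        // no more messages
        actor.Mailbox.Completion.Wait(); // let the mailbox drain
    }
}
```

Because each actor owns its state and only ever sees one message at a time, there is nothing to lock and nothing to deadlock on.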
13:41
The other thing is if we have no global state, if we don't have this concept of global state, we can't broadcast a message to everyone. If I wanted to send a message to everyone here on Twitter, I'd need to know everyone's Twitter handle. Someone would have to be making a list. If someone walks in, there'd be a new person on that list.
14:01
The next time I need to broadcast it, I have to go to this shared state over here of someone who's made a list of all the Twitter handles. We can't broadcast messages either. That's fine. It's interesting, and now it makes more sense having listened to Joe Armstrong, who's a physicist. In 1977, it was mainly physicists and mathematicians
14:20
doing computer white papers. They said the actor model may be difficult to understand for people, so to make it easy, we'll discuss it in terms of quantum physics because that's easier to understand. I like the concept. If anyone has done quantum physics, the example they gave was if you have an atom
14:40
with a load of electrons around it and a light photon comes in, it can hit an electron, give it energy, and it moves up. If you did physics when you were at school, there were these funny rings around an atom that had energy levels for electrons to be in. Then what might happen is it loses the energy and goes back down, at which point another bit of light comes out,
15:00
and that hits another atom and causes it to all happen. That's the example they were giving. By the way, the electrons going in an atom and then coming out and all that is how a fluorescent tube works. That's how fluorescent light actually works. Having made it easy by discussing quantum physics,
15:22
we'll discuss how it works in .NET. Think about this. The Intel Pentium was a thing coming out when .NET came out. We had .NET at the same time as multi-core CPUs were starting to hit. We had SMPs.
15:41
If you were old enough, you might have had your first multiprocessor machine, and they had these Pentium IIs or IIIs that were in a slot, which was an entire circuit board with a fan stuck on it and extra memory glued on it. You shoved it in a massive slot inside your PC, and when you turned it on, it sounded like
16:01
the police had sent a helicopter that was hovering over your house. It was great because they were like 1.5 gigahertz, and it really went through things really fast. Now we've got multi-core. Intel next year, it's a specialist chip, but it is a chip. It's 700 square millimetres, so it's a big chip,
16:21
and it's got its own memory. It will have 72 Atom cores on it, each with four threads, which delivers 288 threads, or three teraflops of processing power in one chip. If you write your normal C# code,
16:41
there's going to be a lot of idle chips on that. To be honest, it doesn't work like that, and you're not going to run Windows on it, but it gives you an example of what's actually physically possible now. We've got this problem, but it's okay because the .NET guys were thinking about threads. If you were in .NET 1, we had a thread pool.
17:00
It was like a wrapper around the Win32 API that said, hey, here's a thread. Have fun with it, but do it really responsibly because it's quite hard to do. It was okay because in .NET 2, they gave us a synchronization context, so now we could try and synchronize across threads and make sure they kind of handed over to each other.
17:21
They even had the event-based asynchronous pattern, so they had a best practice for writing threads and using threads. Nothing really happened in 3 and 3.5. Then in 4, we got the Task Parallel Library. Now, that was cool, because the key one was the concurrent collections.
17:41
We now had a thread-safe dictionary list-type structure, which was great, which means you could actually start sharing data, although you shouldn't really want to share data because that's actually not a good idea, but, you know, let's get around that. You had Parallel.For; Parallel LINQ came in. Then in 4.5, the world changed forever,
18:01
and we got await/async, which was great because it allowed you to do multi-threaded processing. No, actually, it allowed you to have a single thread on your UI while, in the background, IO-blocking tasks were being multi-threaded in the OS. That's how the generic patterns came out.
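A minimal sketch of that idea, using current C# syntax (the file name is a placeholder): the thread is not blocked while the operating system does the IO, which is exactly what keeps a UI responsive.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        using (var reader = new StreamReader("notes.txt")) // placeholder file
        {
            // The await hands the read off to the OS; this thread is
            // free to do other work until the data comes back.
            string text = await reader.ReadToEndAsync();
            Console.WriteLine($"Read {text.Length} characters.");
        }
    }
}
```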
18:21
So it really doesn't help us in a pure CPU-intensive load. It helped with IO-intensive loads to offload things, so that was an easy way for you to make a UI responsive and handle things like that. So, here we are. We've finally got to TPL Dataflow. But it's important, that history,
18:41
because that history is what has got us to this point with Dataflow, and it's a NuGet package. It was CTPed. Has anyone used Dataflow? About four or five of you. Has anyone heard of it before this conference? Better. This is three years old or something. The original CTP was released
19:01
at the same time as Async Await. I didn't really hear about it until I was tech reviewing this book by Apress. I didn't write it, but I tech reviewed it, Pro Asynchronous Programming, and it's by far the most interesting chapter that I read, because it was like, where the hell has this been? No one talked about this. It wasn't in the .NET framework.
19:21
It's a NuGet package. That's one of the issues. That means they can keep updating it much faster than the .NET framework. It means you can release it with your service without the sysadmin going, "4.5.1, service pack one of the framework? We've got to regression test that." You know, that kind of thing. So that is a good point, but the bad point is,
19:40
it wasn't there in the box, and Anders Hejlsberg didn't do talks about it, so you didn't know about it. But importantly, it's for .NET 4.5. They have been updating it, because one person asked, have they just left it alone? No. They've updated it to work with Windows 8 and 8.1. They've made sure it follows the portable class libraries,
20:00
including the permission to use it on something other than Windows, which means you can use it on Xamarin type stuff. So we can use it on OSX, and you can use it on Mono on Linux. You can use it on mobile devices. If they fix the garbage collector in Mono, it might work, but they won't, according to James over there. So everything is awesome.
20:23
Hopefully someone gets the reference. It's all about blocks. So the terminology in the TPL data flow is all about blocks. So everything is awesome is the song from the Lego movie. If you don't have children, you may not have gone. If you don't have children, you should still go, because it's a very good movie.
20:41
So we have these blocks. And a block can be a source. It can emit messages. It can be a target to receive messages. And if it does both, it's a propagator, because it takes messages in and does something and then passes them on. And we can connect blocks together. So it is like a little Lego set. I'll just build this today. And we can filter.
21:00
So we can say, hey, I want to know about creation messages, but I only want to know about the ones to do with image files. So that's the only thing I can talk about and deal with. So that's cool. And we get completion, cancellation, and exception handling. This is all from the TPL. It was all there, so they've used it. They've brought it in, just like they did with async, reusing stuff.
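Those three roles map directly onto block types in the package. A rough sketch, with made-up message values, of a propagator linked to a target with a filter (the `LinkTo` predicate):

```csharp
using System;
using System.Threading.Tasks.Dataflow;

class Program
{
    static void Main()
    {
        // A propagator: receives a file name, emits a message.
        var describe = new TransformBlock<string, string>(
            file => $"created: {file}");

        // A target: the end of this little pipeline.
        var print = new ActionBlock<string>(Console.WriteLine);

        // Link them; the predicate filters so only image files get through,
        // and PropagateCompletion flows Complete() down the chain.
        describe.LinkTo(print,
            new DataflowLinkOptions { PropagateCompletion = true },
            msg => msg.EndsWith(".png") || msg.EndsWith(".jpg"));

        // Messages that fail every filter must still go somewhere,
        // or the pipeline stalls; NullTarget just discards them.
        describe.LinkTo(DataflowBlock.NullTarget<string>());

        describe.Post("photo.png");
        describe.Post("notes.txt"); // filtered out
        describe.Complete();
        print.Completion.Wait();
    }
}
```

Linking blocks like this is the "little Lego set" the talk describes: each block only knows about its own inputs and outputs.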
21:22
So the point of this framework is to abstract that message handling, and we focus on inputs and outputs and links together, and we don't have to worry about all the stuff to do with getting threads out of a thread pool, reusing them, closing down securely, making sure we don't share state between them in terms of things inside those messages
21:41
don't necessarily share state. It handles queuing. It handles a load of stuff. So everything is awesome. At this point, you would have the song playing, but we have copyright issues, as it's going to be available online. So that's enough waffling. That's 20 minutes. That's for death by PowerPoint.
22:01
So we'd better go on to some code. So, yet again, I've got a code example that deals with files. So you can tell I used to be in the digital media area, because I keep talking about files. If you've ever been to the UK,
22:21
there's a company called the BBC who do TV, and they have something called iPlayer, which is responsible, like Netflix, it's responsible for a huge amount of internet traffic in the UK because you can watch everything that's broadcast virtually on their channel on iPlayer. Obviously, you can't do that outside the UK. They even have it so you can download it
22:41
and watch it for a few weeks. They only support Windows and a few iOS devices via another method, and the Linux guys are angry about that, so they wrote something that can download any video off BBC iPlayer and take the DRM off it, and then it will last forever. Obviously, this is not legal. So if I did do this,
23:01
I would only do it for an example in a talk, and I have paid a license fee, and I would have only watched those within one week of broadcast, and never again. Okay, so taking that caveat for why these files might be on my disk, we have really big video files, and we're going to do MD5 checksums. So it takes a while to checksum a 2 gig file.
23:24
So it gives us something that takes time, which is when you're doing parallel type stuff, you want things to take time because it makes the demo so much better. So all we're going to do is we're going to go to a folder
23:41
on our hard drive, and we're going to process it. So this is a console app. It's just in main. So there's a few right lines so we can see what's happening to the console. When we process a folder, we're going to get all the files in that folder, and we're going to go through them, and we're going to work out what the MD5 is.
24:04
This gets faster as my talk goes on as the NTFS file system decides to start caching the files, and it'll cache up to 10 gig in memory. But it is a good demo. So Blade Runner, that's a smaller one because it's a small documentary. It's probably only about 45 minutes.
24:21
That's the hash. Bluestone, half hour comedy. That's going to be reasonably fast. When it gets to Edge of Darkness, which is an HD film that's 90 minutes long, you're in trouble because that's a 2 gig file. It takes quite a while. So you can see, there we are, Edge of Darkness coming up. So it's a standard synchronous program. It is loading one file, working out the MD5,
24:41
telling you what it was. You know, it's still fast. If I'd have done that when .NET 1 came out, my God, that would have taken a long time. What's annoying is I have that many cores. I have four cores, each with two threads, and I'm using one of them to do my MD5 calculations.
25:03
The rest of it is obviously used for Twitter, Skype, media downloads, Facebook, you know, a load of other stuff. Essential, but not quite as essential as the task we're trying to do. So what we want to do is add some TPL dataflow niceness. Now what we did earlier today is I managed my NuGet packages
25:26
and I downloaded TPL dataflow. I decided I wasn't going to do it live because who knows if the internet's up at the moment and I add that NuGet package or the NuGet servers up and you don't want that. So we have NuGet there. We've got the TPL dataflow.
25:40
So we've got this code. It's really simple. It does an MD5 from a file. You'll notice, by the way, that we're using the cryptography library in here. MD5.ComputeHash, on an instance from MD5.Create, and MD5 is part of the cryptography library, System.Security.Cryptography, and there is no async way of doing an MD5,
26:01
so there's no way I can do an async await type structure on this, so it's a bit of the .NET framework that isn't awaitable, so I'm always going to be in trouble. So okay, so we're going to make this into something that I can put into TPL Dataflow. So first of all, what we're doing really is we're just displaying the MD5 from a file.
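As a rough sketch (the helper name and the temp-file demo are illustrative, not the talk's exact code), the synchronous starting point looks something like this: ComputeHash has no async counterpart, so it always blocks the calling thread.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Synchronous MD5 helper: there is no awaitable ComputeHash,
// so this blocks the calling thread for the whole hash.
static string Md5FromFile(string path)
{
    using var md5 = MD5.Create();
    using var stream = File.OpenRead(path);
    return Convert.ToHexString(md5.ComputeHash(stream));
}

// Stand-in for a big video file on disk.
var file = Path.GetTempFileName();
File.WriteAllText(file, "hello");
Console.WriteLine($"{Path.GetFileName(file)} : {Md5FromFile(file)}");
```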
26:25
So let's create a block and we'll use the basic, most basic thing that's in that TPL dataflow library, an action block. We're going to give it a string,
26:40
which is the file name, to calculate an MD5. We're going to make that call a method called display MD5 from file on console. Ooh, that's for later.
27:01
That's an extra bit that I'm going to add. Okay, because I've got ReSharper, Alt-Enter, create new method, bang. Okay, so that's fine. So we've created this display MD5 from file on console. We're going to refactor our code by just pushing this in. So what we were writing on the console
27:21
in that tight little loop, we're just going to go, hey, that's now in this method. It works out the MD5, it shoves it on the console for us. And we're going to create this block that takes file names and calls that method. That's all it does. And TPL handles it, creates a task, calls the method, wires it all up for us. So all we need to do now
27:41
is tell that block it's got some data. So all we do is say, I'm going to post some information to you. It's like posting a letter. So we post this letter in. It's going to take a file path, it's going to do the work, bang, done. Really nice bit is
28:01
we can tell that, by the way, I've finished sending you all the files. And actually, I'm going to wait for you to finish doing all your jobs. Now the interesting bit here is because we're using Dataflow, it's queuing for us. So we can throw loads of file paths at it and we don't have to wait for it to finish the first one.
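A minimal sketch of what's being described: the block body just records the path here, standing in for the talk's display method.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks.Dataflow;

var seen = new ConcurrentQueue<string>();

// The simplest block: a target-only ActionBlock<string>.
// Post() queues a message; Complete() says no more are coming;
// waiting on Completion drains the queue.
var md5Block = new ActionBlock<string>(path =>
{
    seen.Enqueue(path);   // stand-in for DisplayMD5FromFileOnConsole(path)
});

md5Block.Post("bladerunner.mp4");
md5Block.Post("edge-of-darkness.mp4");

md5Block.Complete();
md5Block.Completion.Wait();

Console.WriteLine($"processed {seen.Count} files");
```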
28:25
So in this example, we've gone from standard synchronous code to something where Dataflow's handling for it. Oh, yeah. I have to have a chat with Hadi from JetBrains.
28:41
He didn't guess what I was going to call the parameter. It should do. All right. So all I do is run that and it does exactly what we saw before. But it's done it where I haven't had to think. It's kind of doing it with tasks, but you can't really see that. You see it went a lot faster.
29:01
That isn't because it's now using Dataflow. That's because the NTFS file cache is caching the crap out of my files. But it's still stuck on edge of darkness because it hasn't loaded it all yet. So that's fine. So your reaction here should be, and yeah, what did that gain us? So what advantage have we just had?
29:21
And the answer is pretty much none. We haven't got any advantage over our synchronous code. The magic bit of pixie dust is we can actually configure our block. So there's something called Execution Dataflow Block Options. And that's just a set of parameters
29:41
we can hand in to any of these blocks in Dataflow and say, hey, do this, do this a bit differently. And one thing we can do is we can say max degree of parallelism equals four. Okay. And then when we create the action block. Oops. Did that not like it? It didn't, did it?
30:03
Shouldn't have that on the end. Okay. So when we create our MD5 block, we can give it a block configuration. That's where the semicolon should be. Okay. So we have said, if you can, use four threads.
30:23
So pretty much you can tell it, please use 100 threads. But your system hasn't got 100 hardware threads, even on my multicore machine. So it's going to struggle. So you don't really want to give it more threads than you really have available on your processor. But now we can do four.
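Roughly, the configuration step looks like this; the counters are added here just to show the limit is respected, and the Sleep stands in for hashing a big file.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

int concurrent = 0, peak = 0;
object gate = new object();

// ExecutionDataflowBlockOptions is the pixie dust: here we allow the
// block to run up to four messages at the same time.
var options = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 };

var block = new ActionBlock<int>(_ =>
{
    lock (gate) { concurrent++; if (concurrent > peak) peak = concurrent; }
    Thread.Sleep(50);   // stand-in for hashing a big file
    lock (gate) concurrent--;
}, options);

for (int i = 0; i < 8; i++) block.Post(i);
block.Complete();
block.Completion.Wait();

Console.WriteLine($"peak parallelism: {peak}");  // never more than 4
```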
30:40
We can run it. Woo! Four at a time. Now that was really easy. Something just handled all the thread pool action, reusing threads, reusing tasks, and it does reuse the threads. We can copy a bit of code in that will actually tell you the thread IDs. So just in case you don't believe me,
31:03
we can go to a little code snippet. So I will now paste in our nice action block. It's all ready to go.
31:22
It works. And now we should have thread IDs coming through so we can see that threads are being reused. Tasks are being reused. We're not leaking memory. We're reusing as much as possible. It's fantastic because every thread that you create and rip from a thread pool takes a bit of time to construct, to give it a memory allocation.
31:43
So I think this is really cool because we just went four-threaded on a standard synchronous task very quickly with very little effort. So what we got here in terms of a diagram is we had one little block.
32:02
You can see that it has its own queue inside. So we're able to keep firing file paths at it. And we said, hey, do this bit of work. It doesn't output anything to other blocks; it just writes to the console. And we made it run with four threads. So here's a summary of what the action block really is.
32:22
It's the most simple of the blocks in the Dataflow library, I said. It's a target only. No output. It's called a greedy block because we haven't told it to limit its buffer size. We can actually do things like say, hey, you only have a buffer of two. So only take two items at once.
32:41
Work on one of them and you've got one in reserve. That kind of thing. We can easily increase concurrency. You will have noticed or you may have noticed if we go back to the output is that edge of darkness which is the fourth file was not the fourth file to be displayed on the screen of an MD5. So our output on this because it was inside the block
33:02
was out of order of the input items. So the really nice bit about that is edge of darkness did not stop us processing the other items even though it was a really long file. So now we're starting to really gain the advantage of being multi-processing. And it just executes code.
33:21
So it's a really dull thing. So it's a bit like a train that just goes one stop. It goes in and does stuff. So that's great when you're getting the express train from the airport into Oslo. Airport, Oslo Grand Station, out. A bit rubbish if you're trying to get a train to somewhere else like Zwanga.
33:45
So what we really would like to do is have more than one action. So we'd like to transform our data. So think in LINQ terms of a LINQ Select statement. So we are going to be a target for messages coming in and we're going to be a source of messages coming out.
34:04
And we can pump them, create this kind of pipeline or mesh of processes and join them up. And this is really important because we can start separating our concerns. If you're used to refactoring and TDD and kind of solid principles, you should be already going,
34:22
that created an MD5 and displayed it on the screen. And those are two different things. We should be creating the MD5 and then we should be displaying it on the screen. Now you may go, that's a bit of a stretch, separating that out for no reason. But if you work in WinForms or WPF
34:42
and you're trying to get data back to your UI, you don't want to be doing that from a background thread because there are synchronization issues. So if you can separate your output or your events of the final data from your background threads, we can actually synchronize that last little bit of work with the UI thread.
35:02
And we can do that by using the current synchronization context. Can't do it in a console window because there is no UI in a console window. There's just the console. So this is where you can start separating your jobs. And the important bit is displaying it to the console is really fast and you don't need to parallelize it. But you do need to make sure
35:21
it's on the thread that the UI might be on. And we've got another one called TransformManyBlock that can take one input and output many outputs as well. So that's quite handy. That's like a SelectMany. So let's see that as a diagram. So the nice thing about the transform block,
35:41
we have a built-in input queue and a built-in output queue. Dataflow handles that for us. We don't need to think about it. We can do it with multiple concurrency. We can put four threads on it. Bingo, we've got four threads running. And we can link it with this LinkTo statement to an action block to display the file name.
36:03
And that can do the job of just displaying things onto the screen. Really cool bit about a Transform Block, and this didn't happen with the action block, is the input queue is synchronized with the output queue. So if I put things in in order,
36:21
they come out in the same order I put them in. It doesn't mean we wait and hold up things in the input queue. We can sort them all out in any order we like inside, but it ensures that they come out in the same order. That's actually not a trivial thing to do in your own code with concurrent queues and locking
36:42
and make sure that we're putting the items into the queue correctly. For someone else to do that for you, that's a win. Because that's a bit boring. That's less time on Twitter. So why would you do that? So that's a really simple view of it. So we'll go back to the code.
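The ordering guarantee can be sketched like this. The delays are artificial so that later items finish sooner inside the block, yet the output still arrives in input order (file names are swapped for squares so the sketch is self-contained).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks.Dataflow;

var output = new List<int>();

// Four messages run at once, and the artificial delays mean later
// items finish sooner inside the block, yet the output queue still
// re-emits results in the order the inputs were posted.
var square = new TransformBlock<int, int>(n =>
{
    Thread.Sleep(100 - n * 10);
    return n * n;
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

// The display block is single-threaded by default, so the plain List is safe.
var display = new ActionBlock<int>(n => output.Add(n));
square.LinkTo(display, new DataflowLinkOptions { PropagateCompletion = true });

for (int n = 1; n <= 5; n++) square.Post(n);
square.Complete();
display.Completion.Wait();

Console.WriteLine(string.Join(", ", output));  // 1, 4, 9, 16, 25
```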
37:02
And because I know when I gave this talk a week ago in London at something called Progressive.net, I had three hours in a tutorial. I don't think you want to spend three hours here. So the next speaker might get very annoyed
37:22
as we're out in the corridor trying to do the work. So realistically what we're going to do is we're going to separate our display block. And I've copied the wrong one, have I? Display block, action block. That looks awfully like.
37:41
I haven't copied it, and I didn't. Good. That's better. So we're taking the code we just had. It had four degrees of parallelism. But now we're separating it out. We're having a create block and a display block.
38:00
So the create block we're going to give a file path to. When I first wrote this, I gave it a file path, and it was really easy because all it did was output the MD5. And it was really good because you got a screen saying, hey, I'm starting processing this file. And then it said, hey, I've got an MD5 for you. And you went, for what? Which file is an MD5 for?
38:23
So immediately you're going, right, I need to do something. I could do my own little struct or my own class, but I've done a Tuple of string, string, where Item1 is the file name and Item2 is the MD5 hash. And that means that when the display block gets data, it knows, hey, this file name, this MD5.
38:42
That's all it is. And what we do is we link our creation of MD5s to the display block. So we say, hey, this thing creates MD5s, and it gives out file names and then MD5. That goes into the action block,
39:00
and the action block is displaying it, and all it does is display it. So MD5 with file name, that's really easy. All it does is return, hey, here's the file path, here's the MD5 for that file path. It doesn't display anything to the console. And then the display block is the thing that actually does the displaying.
39:21
We can say to the create MD5 block, we've finished putting file names into you, and then we don't wait for that one to complete, we wait for the display block to complete. And the reason we can do that, and this is really cool, is we have actually said, not only do you, after creating your MD5, link to the display,
39:43
if you're finished, tell everything that's after you that you're finished so that they can tell everything that they're finished, so that they can tell everything that's finished, and you can go down the whole pipeline collapsing your workload. There are reasons you don't do that, but if you start fanning out and having lots of blocks separately,
40:02
you can't start propagating. You have to be a bit more careful. But this is kind of neat. Yes, let's stop debugging. We have just built on a synchronous program, put one action block in, changed it to a transform block, now we're going to buffer it.
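The separated create/display pipeline just described might be sketched like this. FakeMd5 is a stand-in for the real hashing so the sketch stays runnable, and a named tuple plays the role of the talk's Tuple of string, string.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks.Dataflow;

// FakeMd5 stands in for the real hashing work.
static string FakeMd5(string path) => path.Length.ToString("x32");

var displayed = new ConcurrentQueue<string>();

// The create block only computes; the display block only reports.
var createBlock = new TransformBlock<string, (string File, string Hash)>(
    path => (path, FakeMd5(path)),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

var displayBlock = new ActionBlock<(string File, string Hash)>(
    item => displayed.Enqueue($"{item.File} : {item.Hash}"));

// PropagateCompletion lets Complete() collapse down the whole pipeline.
createBlock.LinkTo(displayBlock,
    new DataflowLinkOptions { PropagateCompletion = true });

createBlock.Post("bladerunner.mp4");
createBlock.Post("edge-of-darkness.mp4");
createBlock.Complete();           // tell the head of the pipeline
displayBlock.Completion.Wait();   // wait on the tail

foreach (var line in displayed) Console.WriteLine(line);
```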
40:24
And it looks pretty much the same. Apart from we now have begin and ends in a slightly different format because we've separated them out. But we're getting the same thing. We've got four threads, and look at that. Look how it's cached all my files already.
40:40
It's amazing. So that's great. So we've got them actually in that. They're in order. So we have them coming out in the order that they were on disk. And that's great. And that was caused by the transform block
41:01
keeping everything in order for us. We said use four threads, go for it, just push it out in order. The first thing I got from someone was what if I don't want it in order? And you go, yeah, I can understand that. You want the data to come out as fast as possible. You don't want to wait for that 2 gig MD5 to be calculated.
41:20
So then we can do things like a buffer block. So a buffer block is one of these glue things. It doesn't do any work, so it doesn't execute any code, but it just manipulates data and pushes it from one block to another. It's basically just a queue. That's it. So we shove messages in it,
41:40
and it puts messages out to other blocks, and we link them correctly. If we have that, we can do some really nifty stuff. We can put a buffer block that accepts all the file names that we want to deal with. We can then say, I'm going to send you to four different transform blocks. Not one with four threads, but four with one thread each.
42:03
And we can make the queues, on this diagram there's a big queue, we can make it just one item. So the buffer block will happily for you keep polling the things it's linked to, saying, do you want a new message? Do you want a new message? Do you want a new message? Do you want a new message? And it keeps doing that. And as soon as they're able to take a new message,
42:20
it hands it to them. And those middle blocks, when they're finished with it, they throw it to the action block that we've already got that displays to the console. So now we get out of order execution. If it's a tiny file, it will race through the pipeline. If it's a big file, it'll slow down. You know, maybe this one down here
42:41
is going to go slowly, but these three can deal with all the small files and come out the other end. Select all, come on.
43:06
Right, so how hard is this? So we're building on our stuff we had before. We've now decided to stop them being greedy, and we're going to tell it to only have one item in each queue. So the queues are only going to have one item in.
43:21
We've given up on multi-threading, so max degree of parallelism, that's gone because we're going to single-thread all these things. We do a buffer block over here. We do a display block. That's the thing that displays the file name and the MD5. We're going to create a list of create blocks. So these are the things creating the MD5.
43:41
So we're going to do a nice little for loop instead of just cutting and pasting the code four times because this could be in your configuration file. It might be your code actually runs on a server and knows that there's eight cores, and there's going to be this many threads, and it's programmed to use two-thirds of the threads and can happily dynamically switch when it gets deployed on other instances.
44:01
So it doesn't matter whether you're using a big, small, or medium instance on AWS. It's going to know about it and put the right number of cores on. So in a homage to VB, we start at one and not zero. My gag, it's human. So we've got four blocks.
44:22
They're the transform blocks we're used to having. Our block configuration, I called one item at a time, so everyone knows explicitly what's happening with this. I link the buffer block to that create block. So the buffer starts the job. It's going to link into this create block. We're going to create four of them.
44:40
And that create block is going to link to our display block and fan back in. That's it. We get the files. We post them into a buffer block. The thing that you may have noticed is propagation completion has disappeared. So completion propagation is gone, but I'm not waiting for things.
45:01
I've done a really hacky ReadLine to stop the console program from ending. Remember, this is all happening on threads that I don't own, so if I just carried on to the end of Main, I'd end the console and it would just shut down. So at the heart, it's still a Windows app. It's listening to a Windows message pump, and if there's nothing listening to a Windows message pump in Windows,
45:21
even if it's .NET, the most modern, whizzy bit of .NET, your app closes. You know, that's how it works. So if we do this, we should now... Go, go, go. We should be getting out of order execution.
45:43
So if this works, we're getting MD5s first before we get that big MD5 from the 2GIG file edge of darkness. So that's quite cool. We now have out of order execution, and we're done. So that's good.
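The fan-out arrangement might be sketched like this. The Sleep values stand in for file sizes, and the final Sleep mirrors the talk's hacky ReadLine, since completion isn't propagated here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks.Dataflow;

var output = new ConcurrentQueue<int>();

// A plain queue at the front; it offers each message to whichever
// worker block is free, so one big item no longer holds up the rest.
var buffer = new BufferBlock<int>();
var display = new ActionBlock<int>(n => output.Enqueue(n));

var oneItemAtATime = new ExecutionDataflowBlockOptions { BoundedCapacity = 1 };

for (int i = 0; i < 4; i++)
{
    // Four single-threaded workers instead of one worker with four threads.
    var worker = new TransformBlock<int, int>(n =>
    {
        Thread.Sleep(n);     // bigger value = bigger "file"
        return n;
    }, oneItemAtATime);

    buffer.LinkTo(worker);   // fan out...
    worker.LinkTo(display);  // ...and fan back in
}

foreach (var size in new[] { 500, 1, 1, 1 }) buffer.Post(size);

Thread.Sleep(1000);          // crude wait, like the talk's hacky ReadLine
Console.WriteLine(string.Join(", ", output));  // the 500 comes out last
```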
46:01
So we've now got... We can have it keep it in order and make a simple switch to use four threads. We can have four threads across four blocks and have it out of order. Someone did ask in one talk, can I make the end bit, the display, display them in the same order as I put them in on the buffer when I've spread it out in that four blocks?
46:21
And the answer is, no, we have no global state. So you aren't going to do that. How would I have made sure that all our colors were in order? I could have asked the person who took the first page to write one, the next person to write two, and pass it on. So we could have an ordering in our messages.
46:41
We could have someone at the end collate it and get them back in order. So you can write that. It just isn't in the framework by default. So that is not an issue. It's trivial to do. And to be honest, you'd normally, if you really want to preserve order, just multi-thread a single block and it'll preserve order for you.
47:01
Now, filtering. So I said before that we can filter messages. So when we link this block to that block, we can say, yes, but I only want messages of this type. So this is like LINQ with Select and Where clauses.
47:22
It's a case select type thing. So I used to work in digital media. So we would have different transcoders for images, for the image artwork, for the audio, for bits of video that they put on CDs. So even if you're just selling a CD, they'll shove some video of the making of some song on it.
47:41
So if you want to listen to Christina Perri, one of her videos on one of her singles about how rad the director is, you can. Every time I rendered that and had to test it, my skin crawled. There's nothing radical about that song. Although there apparently were lots of candles and it was really warm if you light a thousand candles.
48:02
No shit, Sherlock. So the point is we can link stuff. Which means we can do recursion. And the old joke: for recursion, see recursion. One has to be careful with this kind of thing with recursion.
48:22
There's a reason case select statements have a default for when you haven't got a case that it matches. So I would be careful about using this. But we're going to do an example. So we had that transform block that was going to handle our MD5 creation.
48:42
So we're going to get the folder contents. Instead of a buffer, we're going to just post a folder in, a root folder. And when it comes out, it's going to say if it's a file, I'm going to link it to a transform block that makes my MD5s. But if it's a folder, I'll send it back to myself.
49:02
And then it's another root folder that I'll then go for. And another root folder that I'll then go for. And then it outputs to the action block. So this really is very simple recursion. So I can now go new folder, put Blade Runner in it,
49:21
copy those two, put those two, put that into folder three. Put those in there. So I've set it up so we now have nested folders. Great. With loads of video. And we're going to do MD5s on that. And here, this is relatively easy.
49:41
We've got a filtered TransformMany. About the only difference you will see in the entire thing is that. And this one. So there's two LinkTos. So the folder contents block that gets a root folder and finds things in it.
50:04
Now it doesn't just get files, it gets files and folders. And it says if it's a folder, link back to me. Whereas if it's a file, link it to one of the four MD5 creation blocks in the middle. That's it. Those are the two lines of code I had to add to this.
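A sketch of those two conditional links, using a temp folder tree built on the fly (folder and file names are illustrative). Completion can't simply propagate around a cycle, as the talk warns, so the sketch drains with a crude Sleep.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks.Dataflow;

var found = new ConcurrentQueue<string>();

// Emits every entry (files and sub-folders) inside a folder.
var folderContents = new TransformManyBlock<string, string>(
    folder => Directory.EnumerateFileSystemEntries(folder));

var fileBlock = new ActionBlock<string>(path => found.Enqueue(path));

// The two conditional links: folders loop back into the same block
// (recursion), files carry on to be processed.
folderContents.LinkTo(folderContents, path => Directory.Exists(path));
folderContents.LinkTo(fileBlock, path => File.Exists(path));

// Build a small nested folder tree to walk.
var root = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(Path.Combine(root, "inner"));
File.WriteAllText(Path.Combine(root, "a.mp4"), "fake video");
File.WriteAllText(Path.Combine(root, "inner", "b.mp4"), "fake video");

folderContents.Post(root);

Thread.Sleep(500);  // crude drain: completion can't propagate around a cycle
Console.WriteLine($"found {found.Count} files");  // the nested one too
```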
50:37
Let me run that.
50:41
And there it is. It's now recursing through. We'll know it's worked because the old code would never have found W1A in Timmy Time. But now it's recursing through all those folders. So we can now have recursive algorithms searching for our data and processing our data. And that's quite cool too. Because that wasn't that hard to do. So you can now see how you can get quite complex pipelines.
51:03
There's other blocks in this framework. So I've only scratched the surface of how you can link these things together. We had maybe six blocks in a little mesh. These can be quite big pipelines. Out of interest, if someone might go, and what about Orleans? Orleans is out of process, message processing like this, supposedly.
51:25
This is all in process and the goal is for it to be in your own code. Broadcast block. I said you can't do broadcasts. Oops. Well, you can. But the broadcast block doesn't broadcast to everyone.
51:41
You link it to a set of blocks to which it will broadcast. The important thing about a broadcast block is if it's ever asked for the latest message, it only gives you the very latest one it was ever given. Imagine you had a webcam. Has anyone ever had a webcam at work where you have a web page looking at the output of the webcam
52:02
so you can see what's happening in the office or a coffee pot or something, that kind of thing? The webcam may be doing 20 frames a second. It takes about half a second to load the image via FTP to the website. Therefore, if you actually try and process all 20 images a second, you will be behind very quickly. So really all you want is the latest state of it.
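A broadcast block behaving like that webcam might be sketched as follows; frames are just integers here, and the Sleep lets the block drain before we poll it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

// A BroadcastBlock holds only the most recent message it was given,
// like the latest webcam frame, and offers a copy to anyone who asks.
var latestFrame = new BroadcastBlock<int>(frame => frame); // clone function: identity

for (int frame = 1; frame <= 20; frame++) latestFrame.Post(frame);

Thread.Sleep(200);  // let the block work through its input queue

// A slow consumer polling occasionally sees only the newest value,
// not the 19 frames it missed.
int seen = latestFrame.Receive(TimeSpan.FromSeconds(1));
Console.WriteLine($"latest frame: {seen}");
```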
52:23
I actually used one of those webcam apps once at Cello Media where we had a SecurID key that had a unique ID on it for accessing a website, and we shoved a webcam on it so that we didn't need to ask who's got the SecurID tag. I suggest you never do that.
52:40
So that's not what they're there for. A batch block. We can actually say things like, you know that if you're on a mobile device or you're doing web calls, you might want to bundle your web calls up. Every time you get a message, you don't want to necessarily say, hey, I need to look up a post code. Hey, I need to look up a post code.
53:01
And every one of them has to create a TCP IP connection, post it, get it back, handle all the headers. So a batch block says, I will accept in my queue this number of messages, or over this time period, I will gather them up and then push them on, which allows you to do things like every 10 messages or one second,
53:22
I will go off and batch this process to an outside external process and be much more efficient in dealing with things. And it does it for you, and you don't have to think about it. Joins are where you want to get two inputs. So I've got nodes that could do work, a load of jobs that need rendering,
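The batch block just described might look like this in code. This sketch shows the count-based trigger only; the time-period variant needs you to call TriggerBatch from a timer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks.Dataflow;

var batches = new ConcurrentQueue<int[]>();

// Gather messages into groups of 10, then push each group on as one
// array, like bundling postcode lookups into a single web call.
var batcher = new BatchBlock<int>(batchSize: 10);
var sender = new ActionBlock<int[]>(batch => batches.Enqueue(batch));

batcher.LinkTo(sender, new DataflowLinkOptions { PropagateCompletion = true });

for (int i = 0; i < 25; i++) batcher.Post(i);
batcher.Complete();          // also flushes the final, partial batch
sender.Completion.Wait();

foreach (var b in batches) Console.WriteLine($"batch of {b.Length}");
// prints: batch of 10, batch of 10, batch of 5
```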
53:40
and I want to join them together and say, hey, this node's free. You can do that job. Join up, push it through. So where we had fanning out, we can now get two pipelines and they can come back together and join. So that combines two into one. The write once block, you give it one message, and that's it. That's the only message you can ever give it,
54:00
and you can keep asking it for that message, and it will keep telling you that message. It's effectively a kind of private static readonly singleton type block. We've got conditional linking, which you've seen. So we can happily do recursion or we can do specialist operations. So at Huddle, we also have software
54:22
that converts Word documents, Excel, PDFs, images, video. They all go different routes. If you have video, it goes through FFmpeg. If it's Word, it goes through Aspose. All that can be handled in a pipeline. It's not handled with Dataflow because it wasn't there when they wrote it. It's handled in Rx, which is much more scary than Dataflow.
54:41
It is. James disagrees, but he's probably used Rx quite a bit. Dataflow, you could hand to a junior programmer, and they'd probably cope with it a lot better. So we've got some block properties. We've got non-greedy blocks, which we've seen. We can do a bounded capacity of one.
55:02
We've got cancellation and exceptions. So we can actually hand in cancellation tokens into these blocks. I've got a little demo of that. So if we go into Solutions for a moment, we've got a cancellation token.
55:24
Then, if you were used to seeing cancellation tokens inside of TPL, it's pretty similar. We have a cancellation token. We've decided to make that as a private static read-only in the whole class. That allows us to include it in multiple blocks
55:42
so we can hand these blocks into things. A cancellation token in TPL is just something you get, and you go, hey, that token, throw a cancellation in. The really nice thing about Dataflow, compared to when we were discussing async await, where exceptions get generated, is that Dataflow allows every block to process
56:04
the message that's currently being processed, but it will not process anymore from its input buffer. So if you want every job to complete inside every block and drain nicely, this is ideal. If you want it to cancel in the middle,
56:20
then you can. By the way, cancelling is just calling that cancellation source's Cancel method. And if you want, you can actually do things like access that shared state, which we shouldn't really have, but a cancellation token is a thread-safe item
56:42
under Windows, maybe not under Mono, but it is under Windows. And we can say, have I been cancelled? So that halfway through a job, if I know there's an intensive operation, so that block does about three things, and we might want to cancel midway through. I can detect it, but I have to write that little bit of code.
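A sketch of cancellation with a token handed into the block options. Note the in-flight message is allowed to finish, but the queued ones are dropped (the numbers and sleeps are illustrative).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

var cts = new CancellationTokenSource();
int processed = 0;

// Hand the token in via the block options. On cancel, the message being
// processed is allowed to finish, but nothing more comes off the queue.
var block = new ActionBlock<int>(_ =>
{
    Interlocked.Increment(ref processed);
    Thread.Sleep(50);
}, new ExecutionDataflowBlockOptions { CancellationToken = cts.Token });

for (int i = 0; i < 100; i++) block.Post(i);

Thread.Sleep(120);   // let a couple of messages through
cts.Cancel();        // queued items are now dropped

try { block.Completion.Wait(); }
catch (AggregateException) { /* completion faults as cancelled */ }

Console.WriteLine($"processed {processed} of 100 before cancelling");
```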
57:00
But it's relatively easy, and the draining of all those blocks is quite nice. The idea that it allows all the work to finish. There's not many people who want to write half an XML file or trash a transaction halfway through and not roll back and cause issues with databases, so it's a wise move to actually close down properly. And we get aggregate exceptions,
57:21
because in a block, we may have multiple things going wrong before we find out. So we need to be told that transform block that was full threaded, all four may have had an exception, and we need to know about it. So it comes out as an aggregate exception. So the resources that I have for this are the white paper. It's worth reading the white paper.
57:40
If you're not into maths, you stop reading it about a third of the way through when it starts being littered with equations. The guy who did the analysis, he's got a very good presentation. Again, it goes heavily into the maths. But the maths is what proved why it's a theoretically good idea to do this. This book from Apress, I don't get extra royalties by plugging it,
58:03
but I quite like it as a book. And Stephen Toub's original white paper on Dataflow came out in April 2011. So it is over three years old. At that point, it worked on .NET 4, but it now needs 4.5. And a guy wrote a decent blog series
58:20
analyzing that white paper and studying how it worked. I'm Liam Westley. You can get me on Twitter. Clearly, I keep mentioning it, so I must be on it all the time, @westleyl. Or you can email me at liam.westley@huddle.com. I have a blog, which I hardly ever update because I'm too busy at Huddle nowadays. I'm going to thank Huddle because they allow me to work on stuff like this.
58:42
They give me 20% of my time to work on community or open source projects or research projects. And they pay me to be at a conference like this, as in that I don't need to take leave. And we have jobs. And I meant to tell you, we have jobs in QA and Dev. If you want to work in London, which you probably don't because you're probably quite happy
59:01
living somewhere in Europe. But if you want to live in London, if you want to go to a medium-sized start-up that's getting traction in the US, then come and have a chat. I get a signing bonus, so clearly I have an incentive. But we will get a meal out of it. And the most important thing is...
59:21
Let's go. The code for everything I've shown you today... That's typical. Just when you need it to work. It's up on GitHub.
59:40
Any questions? Yes. James. Yes, you can give it a synchronization context. So you can hand it a context to say you run over here on these threads. So you can choose...
01:00:00
to execute things on certain threads. Obviously one of the goals was for you not to have to worry about that if you don't need to worry about it, but when you do need to worry about it, yes, you can specify it. Okay, well, oh, sorry.
01:00:24
Yes, realistically, this is an in-process framework, so yes, you're talking about going to more cores, more threads on the same box. But in my view, you'd normally have a system like this: my buffer block would normally be taking messages off RabbitMQ, it would be spread across multiple machines, it takes 10 messages at a time, then pumps them
01:00:42
through the framework and does the pipeline there. That's how I would scale it. You might have multiple services, one of which only does images and video on a high-end machine, and one does just simple word docs because they're easier to convert. So nothing says you can't combine this with another messaging infrastructure like RabbitMQ, like MSMQ, and use that to do
01:01:01
messaging, or even Get Event Store, and take messages off Event Store and process them. So, you know, Get Event Store, got the right name. Thank you very much.