The Toolmaker’s Guide
Transcript
00:01
Right, so hello everyone. The purpose of this talk today is really just to convey
00:08
my personal approach to tool development and, if I can, to distill some guiding principles from my own experience on how to design effective tools. But maybe "principles" is overselling it a little bit. It's hard to define these things concretely, so maybe I'm just going to give some hand-wavy advice instead, and articulate my thought process and the motivations behind some of the tools that you may have used, such as D3 and TopoJSON. Although this talk was conceived primarily as a talk for toolmakers on how to make better tools, I hope that it also speaks to people who use tools, which is really all humans, but especially people attending this conference. So perhaps it makes you consider how the tool you've chosen to use affects your work, and whether you could switch to a different tool that suits your needs more precisely, or even whether you want to become a toolmaker yourself in order to build the perfect tool.
Now, before I can really convey how to design a tool effectively, I think we have to step back a little bit and ask some more basic questions about how tools affect the creative process. I think the role of tools is often taken for granted. They help us a lot, and we use them so much that we don't really think about the role that they play. So I'm going to ask some perhaps obvious, basic questions now, but I think it's a useful starting point: what is a tool anyway, why do we use them, and what role do they play? My answer is that tools exist to increase our efficiency, where efficiency is a measure of the value produced per the effort expended. Value can mean the quality of the good you're creating, or the quantity, the number of things. Effort can mean the amount of time it takes, the level of exertion required, or even mental anguish; the emotional state you're in when you're doing something obviously matters. So for a physical tool that might mean applying leverage to pull a nail out of a wall, and for a virtual tool it might mean reusing a library in order to build a map more quickly.
But the point here is that tools have extrinsic value rather than intrinsic value, which means they have no value inherent unto themselves. I think you understand this: you can't eat a hammer, and it doesn't feed your family directly. You have to use it in order to produce value. So even though we're asking a fairly obvious question here, I think there's a non-obvious implication, which is that as a toolmaker you have to remember that your tool has only extrinsic value. You can't get too enamored with the idea of the tool in the abstract; you instead have to care about how people actually use your tool in practice.
Another important consideration is that tools do not increase our efficiency uniformly. A tool makes some tasks easier and some tasks harder. The reason is that tools are designed by people, or in the worst case by committee, to solve particular tasks. Tasks within that intended set are made easier, and tasks outside that intended set are often made harder. That's simply because tools don't make tasks easier by chance; they have to be designed intentionally to address specific problems. As a result, tools influence how we solve these problems. All creative processes require choosing between competing approaches, and as rational actors we want to choose the sequence of operations that maximizes the total value produced relative to the amount of effort. That might mean we want to make the best graphic before deadline, so we're maximizing value. It might mean that we want to balance multiple simultaneous projects, so we're minimizing the amount of effort. Or it could even mean that we want to do a reasonably good job and still spend some time with our families. So tools are really changing the inputs into this cost-benefit analysis that we do when we decide how to solve a particular problem. To me this is related to the adage you've probably heard, Maslow's hammer: if all you have is a hammer, everything looks like a nail. That also speaks to how this influence is sometimes subconscious; it's not always obvious how a tool is changing our decision-making process as we create something. Now, one of those
05:07
canonical examples, in my field at least, is Excel. If you want to make a scatter plot in Excel, Excel makes a bunch of choices about how that scatter plot should appear. It chooses that there should be a legend and a title, which is based on the header row here ("Y", not very helpful). It chooses the number of ticks to draw on the x-axis and the y-axis, and the color and font of those ticks. And then of course, most importantly, it chooses how the actual data glyphs, the dots in the scatter plot, appear. We should really admire the attention to detail that went into these dots: there's a diamond with a vertical linear gradient that goes from a light blue to a medium blue, a dark blue edge with rounded corners, and a gorgeous drop shadow. It's really a stunning amount of effort that went into this dot, and I think they're hoping that everyone will be impressed with your scatter plot as a result of all this design work. But I'm being sarcastic. I don't think this is really the right glyph to use in most applications, particularly because all these little details contribute a lot of visual noise. Really you want to understand the pattern in your data, and having a lot of noise in these glyphs just ends up being distracting. This is the type of glyph that maybe works if you only have 12 points in a scatter plot, in which case you maybe should just be drawing a table anyway and showing the raw values.
Now of course Excel doesn't force you to make a scatter plot this way. It has an extensive user interface for customizing the appearance of that scatter plot. But the issue is that even if it's possible to customize, it's still pushing you toward that default. Just because something is possible does not mean that people will actually do it, that they'll actually use this functionality. That's especially true if the interface for customizing it is an afterthought or partially neglected: it's hard to use, it's a complicated sequence of menus, and it's not always clear what these things are doing. So we have to think about this trade-off and what the interface is pushing us towards. Now this is a slightly more
07:28
subtle example, but something I think about as well. You may recognize this: it's a detail of a liquid crystal display, and these are the red, green and blue elements that comprise a pixel. Each of these little lights is individually addressable to set the color of each pixel. This type of display really is the basis of the most common color space that we use with computers, RGB; prior to LCDs, CRT displays had similar red, green and blue dots. So the RGB color space is essentially direct control over the hardware. That's not entirely true, because there is a slight transformation done in terms of your color profile, the temperature and the gamma, but relatively speaking it's a fairly low-level interface for specifying color.
Now, RGB is a perfectly fine color space, particularly for things like photography, where you can reproduce a large gamut fairly easily. But it's not a great color space for visualization, because it lacks the property of perceptual uniformity. That means if you increase the intensity of, let's say, the green channel by a fixed amount, you perceive that differently than if you increase the intensity of the blue channel by the same amount. So distances in color space don't correspond to our perceived differences, and that means this color space is basically distorting your data if we use RGB to encode data.
Now of course we use RGB everywhere. This is an example, a toy choropleth map. It doesn't have real data; the color is actually the area of each of these counties in the United States. That's because I was lazy, and if you use random data the choropleth is completely useless, whereas area is actually a fairly reasonable proxy because there's a geographic pattern to it, and it's easier than picking a dataset. But of course this is a pretty terrible choropleth, both because the RGB color space is not a great way of encoding data and because the particular colors that I picked here, steel blue and brown, are just kind of muddy, so there's not a lot of contrast. I'm including this example merely to show how, in D3, if we don't do anything, we're encouraging people to use this RGB color space, which is not a good choice for visualization. Furthermore, if we build these examples and don't do a good job of picking colors, we're encouraging people to use these bad color scales.
So one of the things I've tried to do since then: first, we added support for these perceptual color spaces in D3. This is the same choropleth, but now using the HCL color space. There are two color spaces that correspond to RGB and HSL: one is Lab, and the other is HCL. HCL is a cylindrical color space, so you can specify hue directly. Anyway, I hope you can see that this is obviously a better color scale for this choropleth: it has better contrast, and the changes in color actually correspond proportionally to the data.
Similarly, another thing I've been trying to encourage people to use is ColorBrewer, which is an excellent set of off-the-shelf color scales. There is a perfectly fine interface on the ColorBrewer website for picking out your color scales, but I think there's still more that we can do to make those easier to use. That includes examples that use ColorBrewer scales, and the D3 repo has a JavaScript and CSS interface to ColorBrewer to make it easier to apply those scales to your maps and visualizations. I also made a simple example that lets you point and click in order to get an array of colors that you can plug into any scale. So really it's about identifying how you want people to do things the right way, and then making those ways more discoverable and easier for people to adopt: you're lowering the barrier to entry.
Now, this year I also discovered this really interesting color scheme called Cubehelix, by Dave Green, which is more popular in astronomy for plotting astronomical images. It has this nice property of a continuous increase in perceived intensity, from light to dark, but it also does a rotation through hue as well to try to increase contrast. Really this is intended to replace the ubiquitous rainbow color scale that you see a lot in scientific visualization, which has a lot of really bad properties in terms of perception. This particular one is sometimes called the ugly watermelon; it's maybe not the most aesthetically pleasing color scale. But the thing I especially love about it is that Cubehelix is really more of a color space than just a single color scale, so you can create new color scales simply by drawing a different line through that color space. This is an example which is actually fairly similar to some of the ColorBrewer scales. So you can create your own color scales that have that nice property of perceptual uniformity but still have control over the aesthetics as well. Now this is
12:56
what the code looks like. Obviously you're just saying these are the start color and the end color, and then we interpolate between those two colors. I'm including this really just to show that there's no real difference in the API when you switch from one color space to another: it's just a matter of replacing interpolateRgb with interpolateHcl or a similar interpolator. From an API design perspective, what I find so compelling about this is that it makes a dramatic improvement in the quality of the visualization: we're switching from a color space that distorts the data to a color space that has higher fidelity. And along with that there's really no cost to the change, because you're just changing one word to another word. You're not really increasing the API surface area. There is a little bit of implementation complexity, which was my initial objection to adding this feature, but really what you care about is the API weight, and this has minimal API weight and a huge increase in value.
I included the choropleth example as well to illustrate that, when we're thinking about how our tools influence the creative process, it's not just the tools themselves that matter, but also the examples, the documentation, the tutorials, and the related materials that we provide with the tool. Examples are particularly powerful in this regard as a starting point: it's easy for people to pick up an example and just ignore all the rest of the documentation, and that often happens. So if you made your example fairly quickly and maybe didn't consider all these subtle aspects, examples can also be dangerous, in that they make it easier for people to replicate a bad practice, like the steel blue to brown color scale. When we're making an example, it's often the case that you're trying to demonstrate one particular good practice and you inadvertently demonstrate a bad practice at the same time, because it wasn't really the point you were trying to make. So when I'm making examples I try to be careful about that, and also to articulate what the intent of the example is. That way, if I do something silly because it's fun, or because I'm lazy and just want to do this quickly, then as long as I'm making it clear to people that they shouldn't do this, that this is a bad practice and here's why, I'm at least mitigating some of the risk.
Just as a personal example, I think the example that I am most ashamed of is the D3 show reel, which we made when we published a paper at IEEE InfoVis. The show reel was trying to showcase all the different transitions that were possible within D3; it went through eight different chart types. I can't show it because I'm embarrassed by it now. Even though it was a cool demo of all the different functionality of D3, the problem was that people saw it and they were like, "OK, how can I get my data into the show reel?" I got numerous emails from companies, from people who wanted to replicate the show reel with their data, and that's really frightening, because it's a horrible interface for a visualization. But I didn't make it obvious when I published that example that the purpose was just to demonstrate functionality and not to be a usable interface. I now try to do a better job
16:30
since then when I make my examples, and hopefully I have started to drown out some of those bad examples with more good examples as well.
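To make the earlier point about swapping interpolators concrete, here is a minimal standalone sketch in plain JavaScript. It is not D3's implementation: `interpolateRgb` and `interpolateHcl` below are local functions that only borrow the names used in the talk, colors are assumed to be `[r, g, b]` arrays, and the sRGB/CIELAB conversion constants (D65 white point) are the standard published ones. The point it illustrates is that the call site changes by one word while the interpolation path through color space changes substantially.

```javascript
// Standalone sketch; these functions only mimic the shape of the API
// described in the talk and are not D3's code.
const steelblue = [70, 130, 180]; // CSS "steelblue"
const brown = [165, 42, 42];      // CSS "brown"

// Interpolate each sRGB channel linearly.
function interpolateRgb(a, b) {
  return t => a.map((v, i) => Math.round(v + (b[i] - v) * t));
}

// sRGB <-> CIELAB helpers (D65 white point).
const WHITE = [0.95047, 1.0, 1.08883];
const KAPPA = 24389 / 27, EPSILON = 216 / 24389;
const toLinear = c => {
  c /= 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
};
const toChannel = c => {
  const v = c <= 0.0031308 ? 12.92 * c : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
  return Math.round(255 * Math.min(1, Math.max(0, v))); // clamp out-of-gamut values
};
const f = t => (t > EPSILON ? Math.cbrt(t) : (KAPPA * t + 16) / 116);
const fInverse = t => (t * t * t > EPSILON ? t * t * t : (116 * t - 16) / KAPPA);

function rgbToLab(rgb) {
  const [r, g, b] = rgb.map(toLinear);
  const x = f((0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / WHITE[0]);
  const y = f((0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / WHITE[1]);
  const z = f((0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / WHITE[2]);
  return [116 * y - 16, 500 * (x - y), 200 * (y - z)];
}

function labToRgb([l, a, b]) {
  const fy = (l + 16) / 116;
  const x = WHITE[0] * fInverse(fy + a / 500);
  const y = WHITE[1] * fInverse(fy);
  const z = WHITE[2] * fInverse(fy - b / 200);
  return [
    3.2404542 * x - 1.5371385 * y - 0.4985314 * z,
    -0.9692660 * x + 1.8760108 * y + 0.0415560 * z,
    0.0556434 * x - 0.2040259 * y + 1.0572252 * z
  ].map(toChannel);
}

// Interpolate hue, chroma and lightness (the cylindrical form of Lab),
// taking the shorter way around the hue circle.
function interpolateHcl(a, b) {
  const [l0, a0, b0] = rgbToLab(a), [l1, a1, b1] = rgbToLab(b);
  const c0 = Math.hypot(a0, b0), c1 = Math.hypot(a1, b1);
  const h0 = Math.atan2(b0, a0);
  let dh = Math.atan2(b1, a1) - h0;
  if (dh > Math.PI) dh -= 2 * Math.PI;
  else if (dh < -Math.PI) dh += 2 * Math.PI;
  return t => {
    const l = l0 + (l1 - l0) * t, c = c0 + (c1 - c0) * t, h = h0 + dh * t;
    return labToRgb([l, c * Math.cos(h), c * Math.sin(h)]);
  };
}

// Swapping color spaces is a one-word change at the call site.
const inRgb = interpolateRgb(steelblue, brown);
const inHcl = interpolateHcl(steelblue, brown);
console.log(inRgb(0.5), inHcl(0.5));
```

Both interpolators return a function of `t` in `[0, 1]`, so either can be dropped into the same scale. In the HCL version the lightness of the midpoint is the average of the endpoint lightnesses, which is the property the talk relies on when it says changes in color correspond proportionally to the data.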
16:39
So I said previously that tools can make things harder, which seems somewhat counterintuitive. Shouldn't tools just make things easier? And if a tool isn't intended for a particular task, can't I just switch to a different tool that is intended for that task? The problem is that you can't always do that. There's this concept of viscosity, which comes from a framework called the cognitive dimensions of notations. The framework is really designed to evaluate how easy or hard it is to use software, and the concept of viscosity represents resistance to change: if you have a piece of code, how easy is it for you to edit that piece of code to do something else? Although this framework is really intended to analyze software, I think it applies equally well to humans in the creative process using tools, in the sense that we develop these learned behaviors. We become familiar with particular tools, we become experts in tools, and that makes us resistant to not using those tools in the future. If we have our favorite tool, even if we know that it's not really the right tool for the task, if it's the tool we're most familiar with we're going to have a tendency to use it anyway. To some degree this is rational: you don't want to be constantly learning a new tool every time you do something; you want to leverage that learned behavior. But at the same time there's a risk that you become overspecialized, inflexible, and unwilling to change, so you have to balance those two things.
So even though examples can be dangerous in terms of propagating bad behaviors, as well as good at propagating good behaviors, one of the things that I think is universally good about examples is that they lower viscosity, because they make it easier for people to pick up the tool. That's true of documentation as well: you're lowering the barrier to entry. And if you think back to this concept of extrinsic value, I feel like documentation, tutorials and examples are really one of the best ways you can spend your time as a toolmaker. Maybe you're not addressing the particular functionality of the tool, but you are making your tool accessible to more people, and you're helping those people use the tool more effectively, so you're able to produce more extrinsic value even though you're not working on the tool directly.
OK, so here's a shameless plug for a tutorial I wrote recently that I encourage you to check out if you're interested in D3 and TopoJSON, about making a bubble map of U.S. counties. So,
19:16
just to recap the first part: all tools have bias, because they're designed with a particular intended set of tasks in mind, and there's no single tool that does everything equally well. There's no sonic screwdriver, for those of you who are fans of Doctor Who. There's a cost associated with switching tools as well, so we can't just constantly switch to the right tool for the particular task at hand; we're always going to be resistant to change to some degree. Thus the tool that we have chosen is going to affect our end result, and we have to keep that in mind as we're designing something and try to make that subconscious influence more explicit.
So now I want to look at a more specific example, hopefully relevant to this audience, about map projections. Often when we think of map projections, it's tempting to think of them as point transformations: you have a function that takes a point in longitude and latitude and returns a point in x and y. It's a deceptively easy approach, and it matches the mathematical definition. When you go to Wikipedia and look up the definition, it's going to say x is some function of lambda and phi, and the same thing for y. So often you might see code like this: this is the implementation of the spherical Mercator projection, simplified a tiny bit, and it assumes a unit radius for the sphere, but this is really all that code does.
Unfortunately, when you think of projections this way, it ignores a critical step, which is that in order to project a sphere down to the plane you have to make a topological change. Those two objects are topologically distinct, so you actually have to cut the sphere in order to flatten it down to the plane. This is often explained to newcomers using the analogy of an orange rind, squishing it down onto the table. The math can kind of get away with this continuous representation, because you can think of it as an infinite number of points, just mapping each point from the sphere down to the plane. But that's not true with computers, because we're not dealing with continuous representations, at least in most cases. We have discrete geometry: we have a set of points, we have lines, we have polygons. That adds a level of complexity to the projection process, and we have to change our implementations accordingly.
This is what it looks like when we're cutting the sphere and flattening it onto the plane. Any geometry that crosses this orange line here, which means it either crosses the antimeridian or goes around a pole, has to be cut in order to faithfully reproduce it on the plane. Now, I said people often think of projections as point transformations, so how is it that they even work? The answer is that they rely on the geometry being pre-cut. Very often the sources of data that we have already have this cut baked into the geometry. If you download data from Natural Earth, it's cut along the poles and the antimeridian for you, so basically someone else solved the problem for you. So if you're just implementing your own map projection system, it's very easy to ignore this critical problem. But this ends up being a source of bias: you can no
22:48
longer change your central meridian, because if you do that you end up with this. It reminds me a little bit of the VCR tracking problem, for those of you old enough to remember what a VCR is. Here is another projection: this is a conic projection, which obviously you don't use very often for showing the whole world, but I'm including it as a particularly bad example, because what you have here is Antarctica, which goes around the South Pole, basically inverting the entire map, because the polygon is wrapping around the entire thing. So what we do in D3 is that
23:34
we apply this cut dynamically. After rotating the geometry to the desired central meridian, we then apply the cut. That means you can now take any one of these projections and freely choose what you want to use as your central meridian. Same idea here with the Winkel tripel. It's actually a little bit easier to see what's going on here, because as the polygons hit one edge of the screen you can see them coming back at the corresponding edge on the other side of the screen. And in the case of Antarctica, it always stays fixed. Now, this is true not
24:14
just of conic and cylindrical or pseudocylindrical projections; it's also true of azimuthal projections as well. This is an orthographic projection, and the naive solution would just be to cull the points that are more than 90 degrees away from the origin. This actually works sort of OK, but as you can see there are some artifacts here, because we were not able to fill in the gaps for the part of the polygon that was culled; instead we're just drawing a straight line between the two points that were closest to the horizon. So D3 actually supports two different types of geographic clipping right now. One is the antimeridian cut, or a cut along any meridian, and the second is a small circle relative to the origin; in this case we're clipping at a 90-degree radius. Jason Davies is actually working on some other types of geographic clipping to give us more flexibility in the future, and some other really cool operations as well, like Boolean operations on spherical geometry. The point of this is that you can't think
25:26
of projections as point transformations; you instead have to think of them as geometry transformations. Rather than just being a function that maps a point to a point, you have to think about a mapping from a point to zero or more points. For example, a point may be culled, or you may have multiple points if you have a composite projection with several different views. It's very common to show a map of the U.S. where you also have insets for Alaska and Hawaii, and in some cases it may be possible to have the same thing represented twice, so the projection system needs to be able to handle those scenarios. Similarly, a line can be cut or clipped and appear as multiple lines, and a polygon can be cut or clipped as well.
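The distinction between a point transformation and a geometry transformation can be sketched in a few lines of plain JavaScript. This is an illustration under stated assumptions, not D3's code: the function names are hypothetical, coordinates are `[longitude, latitude]` in degrees on a unit sphere, and the antimeridian crossing is found by linear interpolation in longitude and latitude rather than by a true great-arc intersection.

```javascript
const radians = Math.PI / 180;

// Point transformation: spherical Mercator on a unit sphere.
// One point in, exactly one point out.
function mercator([lon, lat]) {
  const lambda = lon * radians, phi = lat * radians;
  return [lambda, Math.log(Math.tan(Math.PI / 4 + phi / 2))];
}

// Point -> zero or more points: an orthographic projection centered on [0, 0]
// that culls anything on the far side of the globe.
function orthographicPoint([lon, lat]) {
  const lambda = lon * radians, phi = lat * radians;
  const cosDistance = Math.cos(phi) * Math.cos(lambda); // cosine of the angle from the origin
  return cosDistance < 0 ? [] : [[Math.cos(phi) * Math.sin(lambda), Math.sin(phi)]];
}

// Line -> one or more lines: split a line wherever it wraps across the
// antimeridian, so each piece can be projected without a streak across the map.
function cutLineAtAntimeridian(line) {
  const parts = [[line[0]]];
  for (let i = 1; i < line.length; i++) {
    const [lon0, lat0] = line[i - 1], [lon1, lat1] = line[i];
    if (Math.abs(lon1 - lon0) > 180) {
      const edge = lon0 > 0 ? 180 : -180;   // the side the segment leaves from
      const unwrapped = lon1 + 2 * edge;    // lon1 expressed on the same side as lon0
      const span = unwrapped - lon0;
      const t = span ? (edge - lon0) / span : 0;
      const lat = lat0 + (lat1 - lat0) * t; // simplification: linear, not great-arc
      parts[parts.length - 1].push([edge, lat]);
      parts.push([[-edge, lat]]);
    }
    parts[parts.length - 1].push([lon1, lat1]);
  }
  return parts;
}

// Usage: cut first, then apply the point transformation to each piece.
const pieces = cutLineAtAntimeridian([[170, 10], [-170, 20]]);
const projected = pieces.map(piece => piece.map(mercator));
console.log(projected.length);
```

The design choice worth noting is that the point function stays tiny and purely mathematical, while cutting and culling live in a separate layer that operates on whole geometries and may change how many points or lines come out.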
26:05
Now let's say we build a general projection system that solves all these problems. It turns out we still have a residual influence. Can anybody identify what is wrong with this map? There are probably multiple things wrong with it. Right, so I think somebody pointed out that there is a cut here which is not a country border, and so it's a
26:32
striking red line. This is a line along the antimeridian, a line through the eastern part of Russia. The problem is that even if we're applying this cut dynamically, we still have the old cut that was there. So if we're drawing a stroke around the geometry that we got, in this case from Natural Earth, it's going to have this residual effect of the old cut. So in addition to applying this cut
26:58
dynamically, we also have to undo the effect of the cut that was originally baked into the geometry. This is functionality that TopoJSON provides, and it's actually a harder problem than it appears. It's not just a case of doing a little bit of pattern matching and then removing those coordinates. You can actually get cases where, for example, you have U-shaped polygons that are cut in half, and then the outer part of the polygon becomes the exterior ring and the inner part becomes a hole. Jason has made some really cool examples of this, elaborate spherical spirals, in order to stress-test our clipping approaches. So again, we're trying to identify bias here, in terms of limits on expressiveness that are built into these simpler existing approaches, and then remove them and increase people's flexibility in how they apply these tools. Now
27:55
I talked about rotating the central meridian, or just rotating longitude in order to change the central meridian. But you may want other aspects of projections as well, like transverse or oblique aspects, and that raises another problem. Very often the geometry that we get, even if it's nominally in spherical coordinates (and by that I mean WGS 84, a.k.a. EPSG:4326, where you think of it as being in longitude and latitude and somehow representing spherical coordinates), is in fact better thought of as plate carrée, or equirectangular, in the sense that for each pair of points it's assuming a straight line on the plane between those two points. So you're just interpolating x and y linearly in your polygon geometry. The problem is that if you rotate not just longitude but also latitude, or roll as well, then you actually change the shape, because you're applying a spherical rotation to coordinates that are in plate carrée. So instead what we need is to represent our geometry in true spherical coordinates, which means that the lines become great arcs. So if we
29:09
want to do something like this, where we're just rotating an orthographic projection to show different countries (and every time I see this example I think of the countries-of-the-world song from Animaniacs; that song is going in my head), or similarly, if we
29:28
want to finally remove the northern-hemisphere bias of the Mercator projection (not that we'd really do this, but it's kind of cool to watch), we need to
29:39
have true spherical coordinates. We need to represent lines not as straight lines on the plane but as great arcs, so that we can just rotate them and they don't change: they have true rotational invariance. But of course this then introduces another problem in a projection system, another level of complexity. Even if we ignore the issues of clipping and topology, we're no longer just doing a linear transformation on a line. We can't just take the start point and the end point and transfer them to different locations, because a great arc on the sphere very often becomes a curve when projected to the plane. The way we solve that in D3 is with this concept of adaptive sampling. What adaptive sampling does is sample intermediate points along the line and project those points, and it has a quality metric, which is basically the same thing used in line simplification, just now applied to projection.
This is how it works. In this case we're taking the equator, we have a slightly off-transverse projection, and we're bisecting each segment of the equator. Here it starts out as 90-degree segments. We take the midpoint of a segment, project that point, and measure the perpendicular distance, which is the white line, between the new sample and the existing line. That white line basically is our quality metric: it represents how much better the line is if we add that sample. Then we recursively subdivide, and we keep doing that until the distance is small enough that we don't really need to do it anymore. That threshold is specified just in pixels on screen, in this case something like two pixels.
Now, there are different approaches that we could take. Obviously we can do nothing, in which case we get what you have on the left, and we see these horrible polygonal artifacts. Another common approach would be uniform sampling, which is like, I think, the ST_Segmentize operation in PostGIS. In this case we're taking the equator and sampling it every 4 degrees rather than every 90 degrees. The problem with uniform sampling is that it doesn't understand anything about the projection, so the samples end up being used inefficiently. You have a lot of samples that are used in these flat horizontal sections, which are areas of low curvature, but even though you have so many samples here, you still have some polygonal artifacts, because along the vertical edges here you don't have a sufficiently high density of samples. So by taking the projection into account with adaptive sampling, we can then identify these areas.
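The bisect-project-measure loop described above can be sketched as follows. This is a simplified illustration in plain JavaScript, not D3's resampling code: the function names, the `maxDepth` guard, and the 200-pixel orthographic projection used in the usage lines are all assumptions made for the example, and the midpoint test alone can miss curvature when the projected midpoint happens to fall on the chord.

```javascript
const radians = Math.PI / 180, degrees = 180 / Math.PI;

// Midpoint of the great arc between two [lon, lat] points (degrees),
// found by averaging their 3D unit vectors. Assumes the points are not antipodal.
function greatArcMidpoint(a, b) {
  const toCartesian = ([lon, lat]) => [
    Math.cos(lat * radians) * Math.cos(lon * radians),
    Math.cos(lat * radians) * Math.sin(lon * radians),
    Math.sin(lat * radians)
  ];
  const p = toCartesian(a), q = toCartesian(b);
  const x = p[0] + q[0], y = p[1] + q[1], z = p[2] + q[2];
  return [Math.atan2(y, x) * degrees, Math.atan2(z, Math.hypot(x, y)) * degrees];
}

// Perpendicular distance from point p to the line through a and b (screen space).
function perpendicularDistance(p, a, b) {
  const dx = b[0] - a[0], dy = b[1] - a[1], length = Math.hypot(dx, dy);
  if (length === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  return Math.abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length;
}

// Project the great arc from a to b, adding samples only where the projected
// midpoint strays more than `threshold` pixels from the current straight segment.
function resample(project, a, b, threshold, maxDepth = 16) {
  const output = [project(a)];
  (function subdivide(a, pa, b, pb, depth) {
    const m = greatArcMidpoint(a, b), pm = project(m);
    if (depth > 0 && perpendicularDistance(pm, pa, pb) > threshold) {
      subdivide(a, pa, m, pm, depth - 1);
      subdivide(m, pm, b, pb, depth - 1);
    } else {
      output.push(pb); // segment is straight enough: emit its end point
    }
  })(a, project(a), b, project(b), maxDepth);
  return output;
}

// Usage: an orthographic projection scaled to a 200-pixel radius (an assumption).
const orthographic = ([lon, lat]) => [
  200 * Math.cos(lat * radians) * Math.sin(lon * radians),
  200 * Math.sin(lat * radians)
];
const alongEquator = resample(orthographic, [-60, 0], [60, 0], 0.5);
const acrossNorth = resample(orthographic, [-60, 45], [60, 45], 0.5);
console.log(alongEquator.length, acrossNorth.length);
```

In this setup the equator projects to a straight horizontal line, so it needs no extra samples, while the arc between the two points at 45 degrees north bulges toward the pole and gets subdivided. That is the contrast with uniform sampling: samples are spent only where the projection actually bends the line.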
32:52
We see the same thing on Antarctica, where the first segment goes from minus 180 to plus 180. If we don't do any adaptive sampling here, then we get just a straight line, which is an artifact of how the geometry was originally cut. But once we apply adaptive sampling to that, it can
33:15
automatically determine the right number of points that we need to add in order to get a beautiful curve along the bottom. And of course it works with any projection, even
33:27
these silly, outdated projections like the Mollweide projection here. OK, so just to recap a little bit again: what we tried to do is identify a bias in map projection systems that are based primarily on point transformations, because that approach, although easy to do, ends up limiting our expressiveness in the types of projections that we can make. It's particularly bad because it's a little bit insidious in how it depends on these cuts being baked into our geometry, and in how it differs from the way people often think of these projections as mathematical functions. So what we tried to do in D3, rather than just focus on building a library of these fixed point transformations, is instead to build a system that makes a general projection from any point transformation, so that we fix these problems and increase expressiveness, but also make it easy for anybody to create a new point transformation and automatically inherit these nice properties. So I'll quickly show some of the more
34:31
I'll quickly show some of the more interesting projections that Jason Davies and I have implemented. This is a recreation of Philbrick's interrupted sinu-Mollweide projection. It ends up dividing the world first by latitude: the northern part, I think, is a Mollweide projection, and below a given latitude it switches to a sinusoidal projection. But also the northern hemisphere is divided up into 2 lobes, which have different central meridians, and then similarly the southern hemisphere is divided up into 3 lobes, which can each be specified around their central meridian. A more elaborate example is this recreation of Bartholomew's regional projection by Jason Davies. This is also interrupted: above the Tropic of Cancer it uses an equidistant conic projection, and below that it uses an interrupted Bonne projection. To do this, Jason implemented d3.geo.polyhedron, and the nice thing is it basically lets you specify any polyhedron, and you map parts of the sphere to the polyhedron's faces, and then you can define whatever projection you want for those faces. So in this case he's able to recreate this projection. There's another example; I'll skip this one just so I can show you. This is the Waterman butterfly map, and that uses the same polyhedron projection, but in this case using a more elaborate W5 polyhedron, and it uses a gnomonic projection for those faces. Then another example, and this isn't really interesting from a projection perspective, because this is just a fairly simple transverse sinusoidal projection, but it's using replication, drawing the Earth multiple times, and then using clipping to recreate what is called the shoreline map, by Spilhaus. He did a bunch of these different maps, and Jason was able to recreate them just by using this clipping technique. They have this nice property that they don't interrupt any of the oceans or the continents, so they're only interrupted along the shoreline. So I think that's a nice example of how flexible this projection
36:48
system is. OK, so lastly I want to describe a concept that I keep in mind, that I think is at the heart of effective tool design, and I'm going to describe it by example, talking a little bit about the history of D3 and its predecessor, Protovis. I think we have a tendency, when we are designing a tool, to think about its more superficial qualities, like its feature set, or for a physical tool, its weight distribution, how it feels in the hand. And so we similarly have a tendency to forget the problem that we're trying to solve in the first place. I think to some degree that's inevitable: you kind of have to take the problem for granted a little bit in order to design a solution. But at the same time, if you stop questioning that entirely, it sets a limit on your effectiveness. So particularly when you're starting out, before you think about what the API for your library should be or whatever, you really have to define the problem you're trying to solve as clearly as you can. For me, what I look for I call the smallest interesting problem. The smaller the problem, the easier it is to solve; obviously that's a good thing if you're lazy like me. But smaller problems are also nice because you make fewer assumptions about how the tool is going to be used, and consequently your tool can be more broadly applicable. Obviously you still have to pick interesting problems, because if you solve a problem that's too small, it's not really going to be any more useful than just doing everything by hand. Protovis, as I said, was the predecessor to D3, and it was a JavaScript library for doing visualization. Protovis was really a response to the most prevalent approach to doing visualization, which is these chart-type objects. I showed you Excel earlier, but there are lots of other charting libraries; Highcharts, for example, is really popular for doing visualization. The way these work is they have enumerated 6 or 8 different popular chart types, and you just say, here's my data and here's the chart type that I want, and then it creates the chart for you. So the atomic operation for those tools is picking a chart type, or creating a chart, and you can do a little bit of customization after that, but that's the basic operation.
39:28
and there are not to recreate it right so
39:44
that's an example of sort of a lower level approach
39:46
to that, where you're then composing these operations to do something more complicated. But unfortunately Protovis was limited by its mark vocabulary. It seems like a fairly easy thing to do, to just define the basic shapes that you need in order to create visualizations. So this is the list of different mark types that Protovis supported: areas, bars, dots, images, lines, etc. And it seems like the easy thing to do, right? There aren't that many different geometric shapes that you need in order to create charts, and you can just compose them to do more complicated things. I think we were fairly successful in terms of recreating some of these older, historical visualizations, by Playfair for example. But the problem was that even with that relatively high level of expressiveness, it still wasn't enough, and that's because you knew that your browser was capable of doing these things but Protovis could not do them: for example clipping, or gradients, or dashed lines, or masking, or transitions, or things like that. As the toolmaker, I knew that I could add each one of these features to Protovis, but it was extremely tedious to try to wrap each of these features in the library, and as a toolmaker, as I said, your emotional state matters. You don't want to work on tedious tasks indefinitely, especially because people are always adding new functionality to browsers, and it would just be a constant game of catch-up. Now, another problem
41:23
that Protovis had is that it was slow to do interaction and animation. I think Protovis was primarily conceived to produce static charts, so having a concise definition of those charts. But if you wanted to do any sort of interaction or dynamic changes to the scene graph, it was pretty slow, because it had to redraw everything; it didn't have any understanding of the dependencies between the data and different parts of the representation in order to do that more efficiently. And lastly, the problem Protovis had is that it was hard to debug. There was a lot of internal control flow in how rendering was implemented, and so if you had a small error, a typo in your chart definition, maybe it would work, but then you'd make some change in interaction and it would fail, and when it failed you would end up deep within the bowels of Protovis, with a very deep stack trace. It also meant, because it had a specialized representation, that you couldn't just inspect the DOM and see what was going on; you had very little visibility. I think this is one of those properties that's easy to overlook when you're designing a tool, or when you're evaluating a tool just by reading about it abstractly. As toolmakers we tend to assume, or maybe hope, that people use our tools in the way that they're intended, but obviously people don't do that. People aren't perfect, and people make mistakes, or the documentation isn't clear, and a variety of other conditions mean that people don't end up using the tool the way it was intended. So designing a tool that fails in a way that is more usable, or considering the usability of the tool under duress, in more harsh environments, I think is extremely important as well. I'm sort of bashing on my old project here, but Protovis, I think, also had a very good idea in at least one particular regard, which is this concise mapping from data to visual representation. You could say, and I have a simple example here, you
43:26
could take an array of data and say, OK, this corresponds to this area shape, and then you can inherit a bunch of related shapes from that, so you can add a line on top of your area, you can add dots to that. So that was a fairly concise specification, and it was more declarative, rather than having a lot of for loops, for example, as if you were to do everything by hand. But what we wanted to do with D3
43:52
was to try to address some of those limitations, and we addressed them by removing functionality. We redefined the problem, in the sense that we didn't want to provide a specialized representation; we just wanted to use the standard DOM instead of marks. So you no longer have to define, like, these are the 6 different mark types you can use; instead you can use any element that's available in your web browser, which could mean HTML, it could mean SVG, it could mean CSS properties. All those things are available to you. And instead of instantiating a mark and then having it automatically maintain that relationship between the data and the representation, we have this concept of the data join, which is a transformation of the document rather than just instantiating marks. So briefly, these are some code examples. This is a pie chart in Protovis, where you're creating a panel, you're associating it with some data, you're setting some properties, and then adding a wedge to that. And this is the corresponding code in D3, where you're adding an SVG element to the body, and again you're associating it with an array of numbers. But in this case, rather than creating that pv.Wedge, which was one of the specialized mark types, we're describing a path, which is a basic element that's supported by SVG, and we're using these other 2 components here. One is called d3.layout.pie; all that does is take the array of numbers and compute the angles, the start angle and end angle of each arc. And then we have a thing called d3.svg.arc, which is able to take that definition and turn it into the sequence of path operations you need to actually render the arc. So we took a larger problem, which was how do we specify a graphical model, and we've broken it up into a bunch of smaller problems: one is how do we actually create, update or destroy these elements; how do we set attributes on these elements; and then how do we compute the layout, like the angles that we're using for these pie charts. Now, I think there was a lot of resistance when D3 first came out; some people found this to be upsetting. I think primarily the objections overlooked the benefits of adopting these standards, and I think that really comes from considering the tool in isolation, rather than within the context of the ecosystem of related tools and technologies and standards in which the tool exists. There are a whole bunch of benefits that came from adopting a standard representation, using the DOM and SVG.
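The split of the pie chart into two smaller problems, computing angles and then turning each angle pair into path operations, can be sketched in plain JavaScript. This is an illustrative stand-in, not the source of d3.layout.pie or d3.svg.arc: the function names `pieAngles` and `arcPath` are hypothetical, values are assumed to be positive numbers, angles are measured clockwise from 12 o'clock, and a single slice spanning the full circle is not handled.

```javascript
// Layout step: compute the start and end angle (in radians) of each slice,
// proportional to its share of the total.
function pieAngles(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  let angle = 0;
  return values.map((value) => {
    const startAngle = angle;
    angle += (value / total) * 2 * Math.PI;
    return { value, startAngle, endAngle: angle };
  });
}

// Shape step: turn one slice into SVG path data for a wedge centered at the
// origin. SVG's y axis points down, so 12 o'clock is (0, -radius).
function arcPath({ startAngle, endAngle }, radius) {
  const x0 = radius * Math.sin(startAngle), y0 = -radius * Math.cos(startAngle);
  const x1 = radius * Math.sin(endAngle), y1 = -radius * Math.cos(endAngle);
  const largeArc = endAngle - startAngle > Math.PI ? 1 : 0;
  // Move to center, line to arc start, clockwise arc to arc end, close.
  return `M0,0L${x0},${y0}A${radius},${radius} 0 ${largeArc},1 ${x1},${y1}Z`;
}

// Each string below could be assigned to the "d" attribute of an SVG <path>.
const slices = pieAngles([1, 1, 2]);
const paths = slices.map((slice) => arcPath(slice, 10));
console.log(paths);
```

Keeping the two steps separate means the angles can feed something other than a wedge, and the path generator can be driven by angles that did not come from a pie layout.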
46:37
This is an example made by Derek Watkins, my coworker. This is all just standard SVG; it's 2D graphics, but it looks very much 3D, because you've got this radial gradient being applied to the sphere. Actually I think there are 2 radial gradients, one for the water and the other for the land. There's a drop shadow underneath. And these arcs are actually implemented using 2 different orthographic projections with a different radius, so that you can just draw a curve between 2 points on the surface and another point in space, and then you just layer these arcs. There's a nice fading when the points are getting occluded on the back hemisphere, and there's a small shadow, so you can see where the arc travels along the surface of the Earth. It looks very realistic, and it's all just done with the simple composition of these 2D graphics primitives. And similarly, this is another
47:34
thing; I mean, it's a bit of a silly map, but I think it demonstrates some crazy things you can do. In this case what I've done is take the land and pop it out from the sphere a little bit, and there are actually 3 instances of the land being drawn here: there's the light beige color; there's the underside of the land, which is clipped at a different angle and is a little bit darker underneath; and then there's a 3rd layer, which is the drop shadow, and that's simply done by applying a blur filter to give it this appearance of a shadow on the Earth underneath. And the
48:13
standard representation also means you can now take advantage of your developer tools, this additional functionality that's built into your browser. So this is another example, of epicyclic gearing. I can go in here and I can say inspect element, and so now I can just
48:33
see this rotating here, and I can see the properties that are defined, and I can even edit these different properties, like change the color. This helps you better understand the structure of the DOM that you created. It's about removing this abstraction between what you're
48:53
specifying in code and what your browser's doing. If you more closely match the representation that your browser is using, then you can leverage all these tools, not to mention all the additional documentation and materials and tutorials that apply to these standards as well. Another nice feature is that
49:13
you can take documents that are created by other tools, like Inkscape, or even hand-coded HTML, and you can transform those as well. And as I hinted, you can take knowledge that you learned in D3 and apply that to other tools as well. I mentioned this concept of viscosity before, how much effort is required to learn a tool or to switch tools. If you choose a tool that's based on standards, then very often you're not learning information that is specific just to that tool; you're learning skills that you can take with you if you choose to use a different tool in the future. That really makes it much easier for people to do both, because they can leverage existing resources or training materials, but it's also a more sensible investment of their time, because they know that they're not really committing that time just to your tool; they're committing it to the broader ecosystem of standards that are going to stick around for a long time. So then, just to sum up, to get back to the idea of the smallest interesting problem: I sometimes refer to D3 as a visualization kernel. The idea is to try to identify that smallest problem in visualization that always occurs, no matter what tool you're going to build; you build other tools on top of the visualization kernel. The only thing the kernel really does for you is take an existing scene graph, a document or set of elements, and transform those elements so that they correspond to data. That means creating elements, or updating them, or destroying them. So it's really just that minimal thing that D3 is trying to do for you, and everything else that D3 does is just smaller operations that you then compose on top of that. That also means that D3 doesn't bake best practices into the tool itself.
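The kernel operation described above, transforming an existing set of elements so that they correspond to data by creating, updating or destroying them, can be sketched without a browser. This is a minimal sketch under stated assumptions, not D3's implementation: plain objects stand in for DOM elements, and the names `dataJoin`, `render`, `key` and the `circle` tag are hypothetical choices for illustration. In D3 itself the same idea is expressed through `selection.data` together with the enter and exit selections.

```javascript
// Match existing elements to new data by key. Returns the data needing new
// elements (enter), the matched pairs (update), and leftover elements (exit).
function dataJoin(elements, data, key) {
  const byKey = new Map(elements.map((el) => [key(el.datum), el]));
  const enter = [], update = [];
  for (const datum of data) {
    const k = key(datum);
    const element = byKey.get(k);
    if (element) {
      update.push({ element, datum });
      byKey.delete(k); // matched, so it is not part of exit
    } else {
      enter.push(datum);
    }
  }
  return { enter, update, exit: [...byKey.values()] };
}

// Transform a scene (array of stand-in elements) so it corresponds to data.
function render(scene, data, key) {
  const { enter, update, exit } = dataJoin(scene, data, key);
  for (const { element, datum } of update) element.datum = datum; // update in place
  const kept = update.map((u) => u.element);
  const created = enter.map((datum) => ({ tag: "circle", datum })); // create
  // Elements in `exit` are simply not carried over, i.e. destroyed.
  return { scene: kept.concat(created), created: created.length, updated: kept.length, removed: exit.length };
}

const byId = (d) => d.id;
const first = render([], [{ id: 1, v: 1 }, { id: 2, v: 2 }], byId);
const second = render(first.scene, [{ id: 2, v: 20 }, { id: 3, v: 3 }], byId);
console.log(second.created, second.updated, second.removed);
```

The second call reuses the element whose key is 2 rather than rebuilding the scene, which is the property that makes incremental updates and transitions possible on top of such a kernel.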
It's not deciding how you should make your visualization. Of course, there's a risk that you will make a visualization that is terrible with D3, and other people do that. So in a sense you now have an additional responsibility, which is, in your training, in your examples, in the other work that you do, to try to communicate those best practices. But in my view it's better to do that separately and explicitly, to teach people those things, rather than trying to abstract those best practices by baking them into the tool. If people understand those principles, they'll be able to apply them more generally in whatever they're
51:45
doing, even if they're not using your tool. And I
51:48
believe it's this approach that has really enabled D3 to flourish, both in terms of
51:53
adoption and also in terms of the diversity of beautiful examples that people have made. So just to
52:02
wrap up the toolmaker's guide: reduce bias by making smaller, more flexible tools; if you can't avoid bias, then at least favor good bias over bad bias; and above all else, teach users how to be effective. Thank you very much.
Metadata
Formal Metadata
Title  The Toolmaker’s Guide 
Title of Series  FOSS4G 2014 Portland 
Author 
Bostock, Mike

License 
CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
DOI  10.5446/31619 
Publisher  FOSS4G, Open Source Geospatial Foundation (OSGeo) 
Release Date  2014 
Language  English 
Producer 
FOSS4G

Production Year  2014 
Production Place  Portland, Oregon, United States of America 
Content Metadata
Subject Area  Information technology 
Abstract  Opening Keynote, FOSS4G 2014, Portland, Oregon 
Keywords 
D3 Data Visualization GIS Javascript 