How I fixed UNIX atime! With 10 lines of code and feminism!!!
Formal Metadata
Number of Parts | 34 | |
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/38550 (DOI) | |
Bangbangcon (!!CON 2016) 25 / 34
Transcript: English (auto-generated)
00:22
Projection like four times, so it'd be a great time for it to break. I fixed Unix atime, which is a 40-year-old file systems problem, using approximately 10 lines of code and feminism.
00:42
So I wanted to start this talk out on the right tone, so this is a picture of me drinking a beer while driving a tractor in 2006, while I was working on this code. I've done a few other things since then. I was a file systems developer for about a decade. I'm now the founder and principal consultant at Frameshift Consulting, where I do diversity and inclusion in technology.
01:03
So the first thing you need to know is that every Unix file has three useful timestamps. The first one is the access time, which is the last time that the file was read, also known as the atime. These little abbreviations don't really make sense, but let's go with them.
01:24
The mtime is the last time the file data was written, or the modified time. And the ctime is the last time the file metadata was changed, or the change time. Thank you, NSA, for telling us all what metadata is. Thank you, EFF, for making this great logo.
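The three timestamps described above can be read from any file. The following is a minimal sketch, assuming a Unix-like system and Python's standard `os.stat`; the helper name `file_times` and the temporary demo file are made up for illustration and are not from the talk.

```python
import os
import tempfile


def file_times(path):
    """Return the three inode timestamps of `path` as seconds since the Unix epoch."""
    st = os.stat(path)
    return {
        "atime": st.st_atime,  # last time the file data was read
        "mtime": st.st_mtime,  # last time the file data was written
        "ctime": st.st_ctime,  # last time the inode metadata changed (on Unix)
    }


# Demo on a throwaway file (the temporary path is only for illustration).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
try:
    for name, seconds in file_times(demo_path).items():
        print(f"{name}: {seconds:.0f}")
finally:
    os.remove(demo_path)
```

Calling `os.stat` only reads the inode, so it does not itself count as a read of the file data.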
01:45
So these timestamps are all part of something called the inode, which is metadata (thank you, NSA), and it's written to the disk, or to the storage, along with the rest of the file data. So as an example of how timestamps are used: if you have a mail reader
02:02
which reads mail stored in a file, and a mail receiving program like sendmail that appends new mail to the end of that file, you can use the timestamps to figure out if there's new mail. So if the access time, the atime or the last read time, is greater than, or more recent than, the modified time,
02:21
then you know you've already checked it since the last time somebody put mail in the mailbox. If the atime is less than the mtime, you know you have new mail. Hooray! Okay, so the next thing you need to know to understand Unix atime is that there's a huge difference between memory, SSD, and disk access times. So memory is about a hundred nanoseconds to access, SSD is about
02:45
16,000 nanoseconds, disk is about three million nanoseconds. So this is a huge difference, right? So if you want to talk to memory, that's like going to the corner store. If you want to talk to SSD, that's like walking from the Empire State Building to the New York Trade Center, or the World Trade Center.
03:02
If you want to go to disk, that's like walking from New York City to Washington, DC and back again. Right, so you don't want to go to disk, or you don't want to go to storage, but disk especially. The great thing about reads is that they can often be cached in memory, so you don't have to go to the SSD or the disk.
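To put numbers on the walking analogy, here is a small worked calculation using only the three latency figures quoted in the talk (about 100 ns, 16,000 ns and 3,000,000 ns). The names `LATENCY_NS` and `slowdown_vs_memory` are invented for this sketch.

```python
# Approximate access times quoted in the talk, in nanoseconds.
LATENCY_NS = {"memory": 100, "ssd": 16_000, "disk": 3_000_000}


def slowdown_vs_memory(latencies):
    """Express each latency as a multiple of the memory latency."""
    base = latencies["memory"]
    return {name: ns / base for name, ns in latencies.items()}


ratios = slowdown_vs_memory(LATENCY_NS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f}x memory")
```

With these figures, an SSD access costs about 160 memory accesses and a disk access about 30,000.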
03:20
The problem with writes is that they always have to go to SSD or disk eventually. There's a bunch of "well, actually"s in here about if they overlap or if you rewrite them or merge them together, but really the thing you need to remember is the first rule of file system club: avoid writes. So the problem with Unix atime is that it turns every read
03:42
into a write, because you have to update the last-read timestamp. This is a write. Who thought of that? This is a terrible idea. It's like if you wanted to go to the corner store and you ended up walking to Washington, DC and back again, right? Horrible idea. It's been there since 1969. All right, so here's what Ingo Molnar had to say about atime updates.
04:05
I thought it was great and perfect: "Atime updates are by far the biggest IO performance deficiency that Linux has today. Getting rid of atime updates would give us more everyday Linux performance than all the page cache speed-ups of the past 10 years combined." That's pretty specific.
04:23
Maybe we should fix this Unix atime thing, right? So, all right, here's the first non-feminist solution, which is: no one gets atime. No atime for anyone. So you could turn off the atime updates on an entire file system, or you could do it on a per-FD basis, but who does that? Nobody does that.
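The talk does not say how the per-FD variant works. One plausible reading, and it is only an assumption here, is Linux's `O_NOATIME` flag to `open(2)`, which Python exposes as `os.O_NOATIME` on Linux. A hedged sketch (the helper name `read_without_atime_update` is hypothetical):

```python
import os


def read_without_atime_update(path):
    """Read a file, asking the kernel not to update its atime where supported."""
    # os.O_NOATIME only exists on Linux; fall back to a plain read elsewhere.
    flags = os.O_RDONLY | getattr(os, "O_NOATIME", 0)
    try:
        fd = os.open(path, flags)
    except PermissionError:
        # The kernel only honors O_NOATIME for the file's owner
        # (or a suitably privileged process), so retry without it.
        fd = os.open(path, os.O_RDONLY)
    with os.fdopen(fd, "rb") as f:
        return f.read()
```

Every program would have to opt in like this for each file it opens, which fits the speaker's point that nobody does that.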
04:41
So the first thing you would do after you installed Linux, if you were a power user or you had to do something with high performance, is you would immediately go and turn atime off on all of your file systems by using the mount option. But you couldn't do it by default, because it would break a whole lot of programs that depended on atime, right?
05:03
So you had all of the power users turning off their atime, haha, and hacking at /etc/fstab, but everyone else had this really slow system. So an example of a program that this would break is this mail reader, right? Atime greater than mtime: no mail. Atime less than mtime:
05:22
there's new mail. If you have no atime updates, you have no idea. There's never any mail. Why doesn't anyone love me, right? So whenever you would complain about this, though, a bunch of file systems programmers would sneeringly explain to you how you needed to rewrite your application not to use atime, right?
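The mail reader's check that depends on atime can be sketched in a few lines. This is an illustration of the comparison described in the talk, not the code of any particular mail reader; `has_new_mail` is a made-up name, and treating equal timestamps as "no new mail" is an arbitrary choice of this sketch.

```python
import os


def has_new_mail(mailbox_path):
    """True if the mailbox file was written more recently than it was last read."""
    st = os.stat(mailbox_path)
    # atime < mtime: something was appended after the last read.
    # A tie counts as "no new mail" here (an assumption of this sketch).
    return st.st_atime < st.st_mtime
```

If the file system never updates atime, the left-hand side of this comparison stops tracking reads, so the answer no longer tells the user anything about whether they have already seen the mail.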
05:42
And this is true. Every single application that used atime could be rewritten to not use atime. We could also rewrite every application to not use file systems, or we could just throw away our computers and use rocks, right? Like, what the heck? I'm a file systems developer.
06:01
I want to make stuff for people to use. Okay, so there are a bunch of other non-feminist solutions that people proposed, and they all seemed to center the code and the file system structures. So one of the examples was, people said, okay, let's just keep a big list of inodes in memory that have updated atimes, that
06:21
we should write out eventually, and let's wait for the best possible moment, which is when the system is running out of memory, and then we'll write them all out to disk at once. It'll be great. I mean, really, all storage likes small random writes. And that one got shot down when people did calculations and were like, yes, you could keep storage busy for several minutes doing this. It's a terrible idea.
06:41
So in general, I felt like there were a bunch of solutions in this vein, and I felt like they were missing the forest, which in this case is what users actually wanted to do with their file system, for the trees, which were the file system data structures and the code. I find this pretty frustrating. So here's how I fixed Unix atime the feminist way.
07:03
Remember, at this point it's 2006. It's been 37 years that people have been trying to fix this problem. So the first thing I did is I went to a place that had psychological safety. In 2006, this was LinuxChix. This is a group for women in Linux, and some of our rules were "be polite" and "be helpful", and it was amazing because we could have technical
07:25
discussions without people sneering at each other for their weak use of atime, right? So the next thing I did is I actually listened to people. One of my friends, her laptop was going slowly. I gave her the standard advice: oh, turn off atime, of course. When she told me that her mail reader didn't work,
07:40
I didn't sneer at her and tell her to reconfigure it to use checksums or timestamps or Maildir or whatever the solution was that people usually gave people. I thought, gee, that seems like a reasonable thing to want to be able to do in your application. The next thing I did is I centered the person. I said, well, what is it that my friend is actually trying to find out with her mail reader?
08:00
I thought about it, and what she really wanted to know was not the exact second that this file had last been read. Timestamps are usually seconds since the Unix epoch in 1970. What she actually wanted to know was: was the last read before or after the last write? What was the atime relative to the mtime?
08:21
Because if it's after the mtime, there's no mail; if it's before it, there's new mail. It doesn't matter exactly what the seconds are. So this helped me come up with a solution called relative atime. And it just asks this question: is the current atime on the file, the time it was last read,
08:40
greater than or less than the current write time? If it's after the current write time, don't bother updating it. Don't send that write to disk. We don't really need to know that information. So in the mail reader example, if the atime's greater than mtime already, there's no new mail. If I go and read that file, there's still no new mail.
09:01
It doesn't matter how many times you check your inbox. It's not going to magically produce mail, like, you know, spontaneous generation, right? So you don't have to write to disk. The best part of this, I'm so proud of this, is that it was so simple. I refused to patent it. Relative atime was about ten lines of code in the kernel.
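The decision that relative atime makes on each read can be sketched as a small pure function. This is an illustrative Python rendering of the rule as the talk describes it (update when the mtime or ctime is newer than the atime, or when the atime is more than 24 hours old), not the kernel's C code. The name `relatime_needs_update` is invented, and treating equal timestamps as "needs update" is an assumption of this sketch.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, ctime, now):
    """Decide whether a read should write a new atime to storage.

    All arguments are timestamps in seconds since the Unix epoch.
    """
    if mtime >= atime:
        return True  # file data was written since the last recorded read
    if ctime >= atime:
        return True  # inode metadata changed since the last recorded read
    if now - atime >= DAY_SECONDS:
        return True  # recorded atime is at least a day stale; refresh it
    return False  # atime is already newer than both: skip the write


# Mailbox already read after the last delivery: repeated reads cost no writes.
print(relatime_needs_update(atime=300, mtime=200, ctime=200, now=400))
# New mail was appended after the last read: this read updates the atime.
print(relatime_needs_update(atime=300, mtime=350, ctime=350, now=400))
```

The mail reader's comparison still gives the right answer under this rule, because the atime only goes stale while it is already newer than the mtime.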
09:22
There was a little bit more code in the file system utility tools to create the mount option, but this is basically it. This is what got checked in in 2009. Yes, it took three years. But it asks: is the current mtime younger than atime? If so, update the atime. Or is the current ctime younger? Remember, we've been ignoring the metadata,
09:44
but if the metadata changed, you want to update the atime as well. And then this is one thing I didn't talk about yet: if the old atime, the old last read time, is more than 24 hours old, go ahead and update the atime again, just because there were certain programs that would check to see if a file had been used in a while and delete it
10:04
if it hadn't been. So, like, in /tmp, there's a bunch of temporary files. They will go away unless you have this one extra line of code. And otherwise, skip the atime update. So, yeah, in summary, here's how feminism helped me fix Unix atime. I joined a safer space to talk about technical problems.
10:23
I listened instead of mocking or shaming. I centered the person instead of the file system code. And I answered the real question. And that is how feminism helped me save billions of disk writes over the last seven years. I'm really proud of this solution. Of all the work I did in my ten years of working on file systems,
10:45
it's what I like the most, and I feel like becoming a better feminist helped me become a better file systems developer. And I'm still seeing this happening today: people proposing solutions that are completely myopically focused on the internals of the file system rather than the sort of thing that people actually want to do with their file system.
11:03
So these days I teach ally skills workshops, in part because I learned how important all of this was to building better products. So I teach people how to create safer spaces. I teach them how to listen. This is amazing: people come to me months after the workshop and are like, "My god, everyone is saying these interesting things around me all of the time. Do you know what my daughter's saying? Wow!" You know,
11:26
I teach people how to center the person, and a bunch more stuff, and sometimes there's lightning. No, that never happens in the workshop. Plus, I'm finally writing a book on ally skills, because I want to get it out to more people. So you can go to my website and sign up for my newsletter,
11:41
or you can follow me on Twitter at frameshiftllc. Thank you very much.