Hiring Developers, with Science!
Formal Metadata
Title: Hiring Developers, with Science!
Title of Series: RailsConf 2016
Part Number: 53
Number of Parts: 89
Author: Joe Mastey
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/31519 (DOI)
Language: English
Transcript: English (auto-generated)
00:11
Thanks for showing up, I'm excited. This is my first AM talk where people aren't hungover, and so I'm super stoked that all of you are awake. Presumably awake, I'm told there's a lot of jet lag if you're coming all the way from Seattle,
00:22
but hopefully that'll work. My name is Joe Mastey, and let's talk a little bit about hiring developers. So, I am a consultant. I work with companies on a lot of things, but I tend to work with them on their onboarding, their hiring processes, working with apprenticeships and such, and one of the things that I've noticed,
00:41
both with companies that work with me and some that maybe should, is that we have a problem with hiring, and this is a big issue, right? How many of you have, at your job, a job posting that you cannot fill for a developer? Right, there's a lot of us; it's a big deal.
01:00
And in a lot of cases, this is not just, oh hey, we could use an extra person on the team; this is an exigent threat to your business. This is a big deal. And interviewing is hard. Another show of hands, I like show-of-hands stuff. How many of you have received a terrible interview? Has anyone ever attended one where you go in and
01:22
they just don't have their shit together. Does anyone want to cop to ever having given a terrible interview? Okay, I have. Good, I was hoping that somebody would actually cop to that. And I think that this is funny, because even big companies, so you think about the Googles and the Facebooks and all these companies that have 10,000 developers,
01:41
they're not actually doing better. Their interviews are just as terrible as the rest of us. And that, to me, points to the fact that interviewing is in fact difficult. It's expensive, anybody that you have on your interview team also has a full-time job, right? So these engineers who you expect to take hours
02:00
and hours out of their day, also have a complete set of tasks to deliver. Also on the candidate side. So anybody who is applying for your job, they may also have another day job. They probably have other things going on in their life. And if you think that that's not your concern, remember that the candidates that you really want to hire are the ones that probably already have a job
02:21
in other places where they're applying. So if you give somebody 100-hour homework, they're just gonna move on, right? And I think that we're making it ultimately worse on ourselves. We're not really doing ourselves any favors. My informal sense of how we tend to come up with an interview
02:40
is, how did I get hired, right? So I've been interviewed in a bunch of different ways; that one seemed kinda cool, maybe we'll do that one, or maybe we'll try something else. It doesn't do us any favors whatsoever. And I'll tell you now, there is no perfect interview. So we're gonna talk through a lot of things that make interviews better or worse, but there is no correct answer, per se.
03:02
I will say that there are a lot of bad answers. And the bad answers are the ones that, in large part, we're doing right now. And the result of that (my clicker doesn't work... yes) is that this happens. So this is one of the main contentions I'm gonna make to you. I wanted to put it in early
03:21
because I want you to think about this. The reason that our interviews are bad is typically that we are not measuring what we think we are. When you have a bad interview, when you have puzzles, when you have abrasive interviewers, you're not measuring the candidate,
03:41
you're measuring the interviewer. And the entire point of the interview, obviously, is to see whether that candidate is good. And so this is ultimately why you end up turning down good candidates. This is ultimately why you end up accepting bad candidates. This is why you have 200 interviews and never offer anybody a job. It's because you're not measuring. And I don't think that this is on purpose, right?
04:00
Nobody's bad on purpose. What's happening is that we don't have a tool set. We don't have a mental schema for how to evaluate if our interviews are any good. We tend to be engineering types. We're not coming from an HR background. And so usually it's that "we just made it up" kind of thing. So, good news:
04:21
even if we have not invented correct ways to do interviewing, there are other fields that have, specifically ones that have been around a lot longer than we have: psychology. And what I want to talk about today is industrial and organizational psychology. This is one of the major branches of psychology. It started in the late 1800s, and it really came to prominence around the First World War,
04:42
when psychologists in the army needed to figure out where to place a million recruits, literally one million recruits. And so they needed to come up with a way to handle that process, and so they started to codify what they call selection. So I'm going to include, at the end of this,
05:00
there are all the references, and there is a lot that you can look up. If you do find yourself wanting to look at primary material, "selection" is the name of the concept that you want to Google. Cool? And so it's going to be a little bit tough for me to cover 100-plus years of psychology. Unfortunately, they have written a lot over the course of five generations, as it turns out.
05:22
But so what I want to do instead is I'm going to give you a tool in three parts, a way to think about the interviews that you're doing, and then we're going to cover a couple of the common kind of tropes, the things that we tend to see in interviews, and look at them through that lens. Cool? I like you.
05:40
So, number one is validity. If we have something we want to measure, if we have a construct, as we call it in psychology, validity tells us whether our measurement measures that thing, right? These bullet holes, they're bullet holes, I promise,
06:00
do not very closely align to each other, but they are all centered approximately on the bullseye. That's validity. It's okay that they're spread out. It's okay that, in fact, most of them are wrong because they are measuring the correct concept. And there's a couple different factors to validity, things that I want to consider while we're here. One of them, one type of validity,
06:21
is the question of whether the thing that, the question that you ask corresponds to the concept you want to test. And so if, let's say I wanted to test whether you know arithmetic, if I ask you to list off the digits of pi, is that a test of your arithmetic abilities? No, right? If I ask you to do five plus five,
06:40
may not be a great question, but it is, in fact, arithmetic, right? And the second type of validity that I want to talk about is a sort of wider one. Given that I can test your arithmetic, does that correspond to a skill that I need you to have? I can test your arithmetic, but if the job that I'm trying to hire you for is carpenter, is that actually a valid skill for the job?
07:07
External validity is the name of that one. And you'll notice that all the concepts of validity talk about a construct that you want to test. You have to know what the bullseye is. And so this is actually our first wrinkle
07:20
when we come to hiring of developers. Because as it turns out, we probably do not agree on what makes a good developer. There is a lot of complication to our field. And so as it turns out, it's very difficult for us to say what success even is in this sense. So think about what a great developer would be in your terms, right?
07:41
Hopefully it doesn't look like this. Maybe, maybe not. But what it probably does look like is people that you know or people on your team who have been very successful. And you think about a bunch of characteristics of those people who you've seen who are successful and you kind of generalize and say, okay, that's a good developer.
08:01
But it's not, that is not a real concept of good developer. What that is is kind of a bag of characteristics. Some of them may actually relate really well to whether somebody's a good developer. Some of them may not at all. And so one of the things that happens is when we start to measure people based on what we've seen from success, we end up measuring all these things
08:21
that we didn't intend to and we get that. And we get more of that and that's what our entire team becomes. Reliability is concept number two. So validity is whether we're centered on the bullseye. Reliability is how close our measurements are to each other. So in this case, we don't even care if it's centered
08:41
as long as the measurement comes out the same. And so just like validity, there are a couple different concepts in here. One of the big ones that's really important in technical interviewing is that if I give you an interview and if somebody else gives you that same interview, you should get the same score. If you don't get the same score, you're not measuring the candidate,
09:01
you're measuring the interviewer, right? That includes if I interview you and then another day I were to interview you but I'm pissed off. These things should not impact our measurements but they do. That's called inter-rater reliability. A second one is if I take an interview once and if I take that interview a second time, I should get the same score.
09:21
This is called test-retest reliability. And what that means is that if there's an element of chance, if there's an element of, did I happen to follow the one path or the other path and it totally changes my score, again, I have not measured me, I've measured the instrument or I've measured the random chance that I took A instead of B. Then we're hiring on a dice roll, right?
09:43
And then the third type of reliability, we're not gonna see a ton of this but if I were to have multiple questions, those questions need to yield the same result. So if we go back to our arithmetic example, if I ask you, what's five times five? Thank you, some of you know arithmetic.
10:01
What's eight times eight? Thank you. What's 265 times 12? Clearly you don't know arithmetic, right? It's the same form of question. But what happens is we have these questions where there's some other confounding variable. In this case, we've all memorized a very particular set of multiplications
10:21
and we're not actually doing them in our heads, we're just doing them by rote. This happens all the time in interviews. We measure a construct that some people have memorized and other people have not. And all those concepts of reliability, and I think this is an interesting one, point towards approximately the same thing. They point towards consistency. So we're not gonna belabor this
10:41
in the rest of the talk, but I wanna say that if you give the same interview, if you give it the same way, if people have a scoring rubric so that it doesn't matter who's there, what matters is how you do, right? If that scoring feedback is objective, you will tend to be reliable.
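To make "objective scoring rubric" a little more concrete, here is a minimal sketch of what one could look like in Ruby. The criteria, point values, and helper names are hypothetical illustrations, not something the talk prescribes:

```ruby
# Hypothetical scoring rubric: every interviewer scores the same observable
# criteria, so the result depends on the candidate, not on who was in the room.
RUBRIC = {
  "Clarifies requirements before coding"       => 2,
  "Writes a failing test first"                => 2,
  "Handles the empty-input case"               => 3,
  "Explains trade-offs of the chosen approach" => 3
}.freeze

# Interviewers only record which criteria they observed; the score is
# computed from the rubric, not eyeballed.
def score(observed)
  RUBRIC.sum { |criterion, points| observed.include?(criterion) ? points : 0 }
end

puts score(["Writes a failing test first", "Handles the empty-input case"]) # => 5
```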
11:01
So reliability, consistency, I'll say consistency breeds reliability. So number three, usability. I held up two fingers for three. We could probably come up with a valid, reliable interview for developers, and what it would look like is having you do one of everything, right?
11:21
If I were to do this in my arithmetic, I would just ask you one of every single question. But of course it doesn't work, right? And it's tempting sometimes in our interviews to be able to just smash enough things in that we get that accurate measurement, but of course that doesn't work. It's exhausting, people don't wanna go through it,
11:40
and ultimately what we have is a really limited opportunity to take measurements from people. So we need to be really careful about how our usability works. And this differs between company to company. So one of the things that's really tragic about stealing the Google interview, if you take Google's interview and use it for yourself, they can abuse their candidates and people still wanna go work for Google, right?
12:02
It's true, it's kind of a shitty process, but it works for them because people don't drop out. That probably doesn't work for you. And this is different between candidates as well. So imagine we give homework, let's say it's a 20-hour homework, right? Somebody that takes a bunch of time. Some candidates, if you don't have a job, let's say you just graduated a bootcamp,
12:21
let's say that you just left your job, fantastic, no problem. If you're somebody who has a real job, or God help you, if you're somebody who has a family, if you have other commitments, if you've ever had a medical issue, this is now what you're measuring. You didn't mean to measure that. You didn't mean to just exclude anybody who'd ever had a family, but that's what's happening. So we need to keep usability in mind.
12:43
So, to throw in a couple of other confounding factors: your target probably does not look like this; in reality, nobody's target looks like this. Your target looks like something else, and that is because your requirements, the things that you do, the constructs you need to measure, are different for you than they are for everybody else, right?
13:02
And so you cannot just take somebody else's interview. And when you're thinking about these interviews, you can't simply say, okay, we'll just maximize one dimension, right? I can tell you approximately how valid different types of interviews are, but you have to balance that against the other factors. You need to ask, should I trade off usability for reliability?
13:20
Because the correct answer is often yes. And then in reality, bringing back to the usability, we get this kind of thing. Three concepts, pick two-ish. Probably really pick like one and a half. So there is no perfect interview. You're not gonna get something that is off to the top right here. You're gonna get something that's messy and dirty,
13:41
and what that means is that the only way for you to tell if that interview works is to test your interview. You must test your interviews. And what that looks like, if you have an existing team, this is actually nice, you can give your interview to your existing team, assuming you like them. If you don't like them, just invert the results.
14:04
But assuming you like them, you can give the interview to your own team. But that's also not good enough. And this is why it's so hard to do stuff like measurement is because if you have, let's say that you manage to come up with a team that's all right-handed, if you now create an interview that happens to be very difficult to complete with your left hand, your entire team will succeed.
14:22
Huzzah, we are accurate and valid in all those things. But again, we've managed to measure something that we don't intend to. So you need to go out and you need to test your interview against people who are not part of your existing team and who are not part of the kind of experience and demographics of your existing team. Cool? All right, let's figure out how to use these tools.
14:42
I don't know if that's an ax or a hammer, but it's probably an ax. So we have the tools, we have reliability, we have validity, and we have usability. Let's talk through the interviews. So I'm gonna cheat, this is not part of the interview, but it's my talk and you can't all leave fast enough. So ha.
15:02
The interview process really does start back at the job posting. This is good and this is bad. The reason I wanted to bring this in here is because if you, again, if you have, if you accidentally exclude a bunch of people, if your job posting causes nobody with left-handedness to apply, then nothing you do in the rest of the interview
15:20
will ever fix that, right? They're not even in your queue. So let's talk about what you need. This is the opportunity. So we talked about constructs and we talked about how you can't just steal them from somebody else. The job posting is an opportunity to think about the things that make somebody successful for your organization. You need to think about what things
15:42
you actually wanna measure. And a good rule of thumb here is that if you're putting it as a real requirement, you should probably actually measure it during the interview. If you have something you're not gonna bother to measure it, you probably don't need it. You need to prioritize these things because everybody is different and everybody is flawed.
16:01
And so in reality, if you have a list of six things that you really need, pardon me, you probably really only need four or five of them, and if somebody comes in with most of those, you may wanna hire them anyway. And then again, going to need versus want: I want somebody who's great at testing. I want somebody who's great at refactoring. I want somebody who can scale my services and who can scale
16:20
everything and scale buildings, but I don't need that. And so remember that many people have been socialized not to apply for jobs that they don't fully qualify for. So the more you put under "need," the more people you exclude. I happen not to agree with that, but that is the reality of it. What are you asking for? The actual text of your posting is really relevant.
16:43
Have any of you ever declined to apply for a job, or stopped partway through applying, because you saw a posting that wanted some variation of the ninja unicorn Jedi, right? Anybody? I have declined to apply to places because of that, just not interested, right? Do you think that they meant to exclude me?
17:03
I'm a ninja, yeah. But the reality is that that happens, and so the words that you use, what you actually ask for, matter. In the resources, I'm gonna point to two different things, two resources that I like to use for this. One of them is called Textio. What they do is you put your posting in there and it kinda tells you which of these corporate-y words tend not to do very well, right?
17:22
that people tend not to do very well, right? And that can help you. And the other one is called Joblint. And so that's one where, again, if you've got these kind of ninja rock star, we're gonna go crush some code, it can point out a lot of those things to you that you may not have considered in the past.
17:40
So we need to be careful about what we are asking for. And then we need to think about where we are asking. If the only place that you post your job is Carnegie Mellon, where, by the way, you are like the 50th most interesting startup, right, your team is gonna reflect precisely one background. It's gonna be CMU.
18:01
That's not good. We need variety and depth there. The same thing goes for your network. If your network is relatively homogeneous, if you all tend to come from the same place and do the same things, that is not gonna create a sufficient candidate pool for you. You need to think about where you are posting jobs, and you need to reach outside of that comfort zone to find more people.
18:20
And ultimately, this is a good thing for your team. Next is the initial screen: we get some resumes. Let's talk about the different types of screeners that people tend to give. Trivia, I think, would be the category you would call this. If you wanted a litmus test: if you can Google something in 30 seconds,
18:41
or it'll take you like 10 seconds in IRB, it's probably a trivia question. So, let's think about validity here. In theory, well maybe not this question so much, but there are questions that are trivia that you could ask that might be valid. The problem is that there's not very much signal. Usually if I ask one thing like, what's the name of this method,
19:01
or what's the order of arguments of this method, it's very little data. And what it ends up rewarding, what it ends up measuring instead is recency. Have you dealt with it lately? So, a lot of times the way this interview works is that the director or the CTO reads something on Reddit, and then they go have a fun interview, and they're like, oh hey, do you know this thing?
19:21
What is the memory footprint of this string thing? Is that valid to the job? No. A minute ago you didn't know it, and you were fine. So this is not real valid criteria. Could it be reliable? If you ask the same question, I think it could be reliable. Do people ask the same trivia question?
19:42
No. They mostly get it from whatever they were thinking about last. So in practice this is rarely reliable. Is it usable? Well one trivia question's very easy. I'll say, I think that you could get a more valid version of this by asking like 50 of these. I think that with enough trivia you might be able to get something that resembles a valid question,
20:01
but then that's not usable. This is not a very good one. Does anyone know what this one is? It's a very common phone screen: FizzBuzz. I think FizzBuzz is interesting. So, jumping in: is it valid? Maybe.
20:21
There is an aspect to it that says can you write some amount of code? It does suffer from the problem where people tend to study for Fizzbuzz, right? Like if you are a boot camper, you absolutely are learning to write Fizzbuzz cold. But it does track some kind of concept. Is it reliable? Yeah, it's probably reliable. I can administer it, I can think about
20:41
the different types of things that people get right, get wrong, I can score it. Is it usable? Yeah, it's actually pretty usable. I think ultimately that's why people use it is because it's very easy to administer. If we wanted to make this better, we would probably want to change it from one that's really well known. Like I said, everyone knows Fizzbuzz. If you're looking for a job, learn Fizzbuzz cold.
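For anyone who hasn't seen it, FizzBuzz is the classic "can you write any code at all" screen. A minimal Ruby version, assuming the usual 1-to-100 phrasing of the exercise, looks something like this:

```ruby
# FizzBuzz: print 1..100, but print "Fizz" for multiples of 3,
# "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both.
(1..100).each do |n|
  if n % 15 == 0
    puts "FizzBuzz"
  elsif n % 3 == 0
    puts "Fizz"
  elsif n % 5 == 0
    puts "Buzz"
  else
    puts n
  end
end
```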
21:01
You've now passed 30% of phone screens. Good work. But I think if we changed that, we would have something that's a little more valid. Homework. Who issues homework as part of their hiring process? By which I mean, go write some code on your own time and submit it. Fair number of people. I think, again, homework is an interesting one.
21:20
Homework is super valid. Homework is a work sample test. So in the terminology of IO psych, probably the most valid way to, or the most predictive way to look at somebody's work is to have them do the work. As a work sample test, homework works for this. But it has significant problems. The reliability is an issue here.
21:41
Because what you have is candidates, some of whom can spend 20 hours and some of whom can spend five hours. And if your grading criteria don't take this into account, you end up with a really, really different set of scores for people. And you did not intend to measure, again, whether I have commitments at night. But you did. And so the way we can fix that is maybe to put some parameters around it.
22:02
Hopefully you give them an assignment that's related to your work; that goes to validity. Hopefully you give them an assignment and you say, spend about five hours. Could somebody cheat? They could. But it gets you closer to having an accurate baseline comparison. And hopefully you don't give any homework that is like, please re-implement our whole app, or please work on this NP-complete problem,
22:21
or this thing that our software architects can't even solve, but show us a working demo. Sometimes we have this tendency to just go, ah, that's a thing I was thinking about. Doesn't work. And then, I'm not gonna spend a lot of time on this one, but something I've seen a lot of recently is these sorts of sites that promise to give you a score. And so you go there.
22:40
And I think it's super, super usable because it doesn't take any engineer time. And it may even be reliable because you tend to ask like the one question, right? Or a handful of questions are the same question. But the question of validity comes up here. And I think that this is ultimately where these become problematic is that the questions in their question bank, because they need to be auto-gradable and because they need to generate this big volume of them,
23:02
tend to have very little to do with the actual business of building software. Is that fixable? Probably. Have I seen it yet? No. So interview day. A year ago, Carrie Miller gave a talk at RailsConf about hiring. Problem solved. It's actually a really good talk. You should go watch it. And there are a lot of things
23:20
about the interview day that she covers. The big ones I wanna cover right now is really minimizing variance. So anything that differs between candidates ultimately is going to give you extra noise that you're not measuring. What that means is that you should have a schedule and it should be a consistent schedule. You should know who your interviewers are. And your candidate should know that as well.
23:40
Your interviewers need to be trained. They need to understand what it is that they're doing. They need to have a scoring rubric. And I'm gonna contend here that your candidate should probably know what you're measuring. Because if you're being sneaky about it, it's probably a stupid question. Right? So if you can do those things, tell them what to expect. You're gonna put them more at ease. You're gonna get more signal.
24:00
First thing they go into, code writing. Has anyone implemented a, we'll say, a red black tree at work in the last year? No? No hands? Okay, good. Don't do algorithms. They look like code. They look like code we use, but in reality,
24:21
that's not how we build software. Nobody ever implements red black trees from memory at work. Right? So there's a problem. They feel valid, but they're really not because the correspondence is very low. Are they reliable? No, not really. They have some of the same recency bias.
24:40
Somebody who studies, somebody who just graduated CS, is probably gonna remember this better than somebody who's got a bunch of years of experience. And so great, now you have an inverse selection: you select for inexperience, because those people remember algorithms. Good work. Usability? Yeah, that's probably pretty usable, right? I think a better version of this would be if you were to take an algorithm, like, don't pick something with somebody famous's name on it,
25:02
Dijkstra's anything or any of that. Pick an algorithm that you make up, and have them implement it from a sheet of paper. Say: here's the algorithm, here's your laptop, do that. This defeats the recency bias. This is an actual test of what we do in software, which is translating requirements into working code.
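As a purely hypothetical illustration of that kind of made-up, spec-on-paper exercise (this particular problem is not from the talk), the handout might say: given a list of numbers, return a new list where each element is the average of itself and its immediate neighbors. A candidate translating that spec into Ruby might write:

```ruby
# Hypothetical made-up exercise: replace each element with the average
# of itself and its immediate neighbors (edge elements have one neighbor).
def neighborhood_average(numbers)
  numbers.each_index.map do |i|
    window = numbers[[i - 1, 0].max..i + 1] # clamp the left edge; a range past the end just truncates
    window.sum.to_f / window.size
  end
end

p neighborhood_average([1, 2, 3, 4]) # => [1.5, 2.0, 3.0, 3.5]
```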
25:21
But don't give algorithms. The even worse version, whiteboard coding. Everyone's expecting me to hate on whiteboard coding, and I will, cause it's dumb. If at your job you're required to whiteboard code, not if you have the option, not if people tend to, but if you're required to whiteboard code, please by all means go ahead. If not, what you are measuring is a skill
25:42
that people don't use unless they are practicing interviews. In that case, just give them a fucking laptop. It's really not that hard. And then there's the live bug fix, on real code. Does anyone do this?
26:02
Wait, don't raise your hands. It's technically illegal. If you're not paying your candidates, you can't have them work on production code. In most states. Check your local laws. So the cool thing about this, validity-wise: this is like 100% valid, right? Because this is the work. You could not get more valid than this,
26:21
and that's cool. But the problem becomes the reliability. If you're working on real code, you generally can't repeat the same problem. Or if you do, you have serious problems and shouldn't be hiring. So usually what that means is that we have this trade-off. Either I spend a ton of time in my backlog finding things that are similar-ish, in which case it's not usable,
26:41
or I just kind of pluck something out, in which case it's not reliable. Remember parallel-forms reliability. But also it's illegal, so probably you shouldn't do it. So now that they're exhausted, they've written some code, let's do some problem solving. Does anyone know what interview question this is? I actually got hired on this once.
27:02
It's the one with the dudes buried up to their necks, and they've got hats on, and you have to tell who's wearing what colored hat, and if they do that, then they don't get killed or something. This is just dumb. Let's talk about validity. No. Let's talk about reliability. No. Let's talk about usability.
27:21
Yeah, great usability. Awesome. But you've measured nothing. Not only that, but your candidates are probably looking these up. Again, if you're a boot camper, if you're looking for a job, just go look up the six or seven of these that everybody uses. Learn them cold, you're fine. Pretend that you're having a hard time with it, and then "come up with" the trick. Ta-da!
27:43
A little bit better, case studies. Given a hypothetical, how would you deal with it? This can be valid. You can use your existing work. This can be reliable. You can give the same case study based on historical precedent. You need to modulate a little bit. If you have a senior developer, and you are giving them, say,
28:01
some architectural problem that you can't solve, it's an issue. If you have a junior developer, and you're giving them, effectively, any architectural problem, that's not something that's in their skill set. It's probably not what you mean to measure. But even better, this is my favorite kind of interviewing. Behavioral interviewing. Has anyone ever seen this? So this, it takes the same form every single time. It's tell me about a time when X.
28:22
And the reason this works is that, despite what they have to say in the financial sector, past performance absolutely predicts future performance. Absolutely does. And so, this is a great interview, insofar as you can test somebody's real experience. It's valid. Yeah, I know. Great work.
28:42
It's reliable. You can test the same thing over and over again. And then usability. There's a little bit of a challenge. It turns out that to get good answers out of people, you have to train them on this kind of interviewing. But that's okay. You can overcome this. And then culture fit.
29:00
If your culture-fit interview looks like every one I've seen before, it's your CTO or your director going in, shooting the shit for about 40 minutes, and then deciding yes or no, right? You've had this before. You're measuring for "people like me." There are no criteria, there's nothing. There's a gut feel. And the way the gut feel works is that it takes into account every one of our preconceived notions.
29:22
So it has, again, almost zero validity. My answer to you should be: just don't do these. But you're gonna do them anyway. So if you're not gonna listen to me, at least think about the things in your culture that do cause people to be successful. We ship first all the time. We ship the highest quality all the time, right? We teach everybody.
29:41
Or we're independently capable. These are the kinds of things that actually could measure success; at least measure those. But you're still better off not doing them at all. So you send the person home. Now that they're gone, no more problems, right? So the way the debrief ends up working all the time is, of course, we get into a circle
30:02
and then everybody weighs in. It's kind of like when you do rock, paper, scissors as a kid, you're like, rock, paper, scissors, and people read each other, right? They read each other socially. This is what happens. Or you end up with the person on my left: I think this person's like an eight out of 10, they're like, yeah, no, that was the worst, and you go, yeah, hmm, it's a six.
30:20
Because I'm like, well, I don't wanna be an idiot. So the way that we fix this: we write it down, first of all. Have your interviewers write down specific, objective feedback. Not "she was cool," but "she missed the test coverage issue in this question," "she solved the problem with 10 minutes to spare."
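As a minimal Ruby sketch of what written, independent feedback could look like before it's shared (the structure and field names here are hypothetical, not something the talk prescribes):

```ruby
# Hypothetical debrief record: each interviewer writes concrete observations
# independently; nothing is revealed until everyone has submitted.
Feedback = Struct.new(:interviewer, :observations, :score)

submissions = []
submissions << Feedback.new("Ana", ["Missed the test coverage issue", "Solved it with 10 minutes to spare"], 7)
submissions << Feedback.new("Raj", ["Asked clarifying questions up front", "Struggled with the refactoring step"], 6)

def debrief(submissions, expected_count)
  # Only share once every interviewer has written theirs down,
  # so nobody anchors on anybody else's opinion.
  if submissions.size < expected_count
    "Waiting on #{expected_count - submissions.size} interviewer(s)"
  else
    submissions.map { |f| "#{f.interviewer}: #{f.score}/10 -- #{f.observations.join('; ')}" }.join("\n")
  end
end

puts debrief(submissions, 2)
```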
30:40
And then share all that feedback at once so that nobody can cheat. Make sense? Cool. So, recap. If you wanna hire well, you need to pick a set of constructs, and you need to design and test
31:02
interviews that are valid, reliable, and usable. Cool? So there's one more thing I wanna talk about before we actually go. And the reason is that around the time I submitted this talk, I was having an issue with a couple of clients where they had teams that, we'll say,
31:20
were more than a little bit homogeneous. And so I talked to them about their interview process. And what they always said, and this is feedback I get all the time, is: we only wanna hire the best people. We only hire the best. We don't wanna lower the bar. And I hate this, right? But I couldn't tell you precisely why; like, I couldn't give you the reasoning. I know it's wrong, but I couldn't give you the reasoning. And I understand the reasoning now
31:41
as part of this talk. And the reality is that, for a lot of these folks, they're not bad people, but they believe they have this bar and you're either over it or you're under it. And in that sense, of course you don't wanna lower it. But the reality is the bar doesn't look like this. The bar is weird and tilted and fucked up because of all these extra things that you're measuring that have nothing to do with job success.
32:02
And so I'm gonna say, we should probably actually raise the bar. Most people's interviews are not nearly as tough as they think they are. But to do that, first we need to make sure that the bar is straight. Cool? Thank you.