Taking the Pain Out of Support Engineering
Formal Metadata
Number of parts | 88
License | CC Attribution - ShareAlike 3.0 Unported: You may use, modify, and, in unchanged or modified form, reproduce, distribute, and make publicly available the work or its content for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers | 10.5446/37339 (DOI)
Production year | 2018
Production place | Pittsburgh
RailsConf 2018, 21 / 88
Transcript: English (automatically generated)
00:13
to Taking the Pain Out of Support Engineering. My name is Cecy Correa, and I'm a software engineer over at Context.io.
00:23
Mostly what I do when I'm talking about support engineering is I handle support for a publicly available API, so typically my support users are other developers. So that's mostly the frame I'm basing this talk around,
00:40
but hopefully a lot of the stuff that we'll talk about today is also applicable to other types of support teams. Before we get started talking more about support, I want to start a little bit with a story. Back when I was in college, I got a job working at an amusement park back in Houston, Texas called Six Flags AstroWorld.
01:05
And if you've ever worked for an amusement park, they have a crazy customer-service-oriented type of work culture. So one of the biggest things I took away from that job, and honestly it's been a while,
01:21
so I don't remember much about the job itself, but I do remember one thing that they taught me that they ingrained in everybody, and that's if you don't know the answer to something, don't just walk away from a customer. Say, "I don't know, but I'll find out for you." And that's something that I heard back when I was,
01:43
I don't know, maybe 19 years old, college student, and that's something that has stuck with me pretty much throughout every job I've ever had. And I feel like that mindset of, you may not know something, but I'm gonna go find out for you, is at the heart of what makes a good support engineer
02:02
or a good support engineering team. So what is a support engineer? I've been talking to people throughout the conference and whenever they ask me, "What are you doing here?" "I'm giving a talk on support engineering." "Oh, what do you mean by support engineering?" A lot of people have different definitions.
02:20
So for me, a support engineer is a developer or some sort of technical person that's providing technical support to other developers or maybe end users or maybe within your company. So you're providing some sort of support internally to other developers at your company. So it really depends on the size of your company
02:43
and how you currently handle support. Again, for me, support engineering is supporting other developers who are integrating our public API. But again, it might be different for your company depending on your setup. Why am I passionate about support engineering? Honestly, support has a bit of a bad reputation
03:04
when you talk about support in tech. But I'm actually really proud to work in support and I really love it and I enjoy it. I think this is because twice in my career, I've had first jobs where my job was to support other people. How can you have two first jobs? That doesn't sound right.
03:21
My first job ever out of college, I worked for Electronic Arts, I worked for their customer support division, mostly writing content for the website. And then years later, I decided I wanted to go into programming and my first programming job where I had some sort of engineer in the title, I was a support engineer for Context.io.
03:42
It's also my current job, but I've sort of been promoted since. So twice in my life, I've had sort of like that title as like my first way into an industry. So I'm really fond of support. So let's talk about some support engineering best practices. Today, we're gonna learn how to think critically
04:01
about problems. We're gonna learn how to prioritize a relationship with your support team. We're gonna learn about continuity, specifically business continuity of your tickets. We're gonna talk about ownership of your tickets. We're gonna talk about boundaries because that's actually something that we don't often talk about when we talk about support and I think it's really important to make a happy team.
04:22
And then lastly, we're gonna talk a little bit about some tools that I cannot live without for my support life. So part one, thinking critically. And this is when I'm going to put on my hat of having worked for Electronic Arts for a while. I was assigned The Sims 3.
04:41
Yes, this was a long time ago. We're on The Sims 4 now. And I was assigned to be the subject matter expert for Sims 3 and this meant that I got to play the game a few weeks early and I played it for maybe about two weeks straight at work and I was going at it from the perspective of let's try to find things that are going to confuse end users
05:01
and let's try to preemptively write content that's going to help end users overcome those issues. So what the studio did right in this perspective is that they gave support early access. It doesn't actually always happen. In the year or so that I worked at EA Support,
05:21
we didn't always get a build of the game that we were supposed to write content for. And sometimes some studios would even give us like a bug list and that was super helpful but that wasn't always the case. So a lot of the times we just had to make do with what we were given. And this studio was really nice in that they actually did give us a build of the game
05:43
for us to test prior to launch. They gave us that time to test before the launch. They gave us plenty of time and that gave us the time that we needed to preemptively generate FAQs. I actually looked today and I saw one of the FAQs that I wrote back during the launch of Sims 3.
06:03
It's still up there. And that's what happens when you give your team the time and the tools to write good content. Good content sticks around. So as you can see, I wrote this about eight years ago and it's still up online so I'm really proud of that. And this taught me that if you remotely think
06:24
that something could be an FAQ, it should be an FAQ. And what this whole sort of case study around Sims 3 and giving the support team early access and giving us the time to really go through the game
06:40
and think critically, that's something that you need for each launch that you have. You need to give your support team the time to look at what it is that they're supporting and put on their thinking hats and start thinking in terms of, is this going to be a question? And that's not something you can really teach. That's something that you can only learn by doing.
07:02
So giving your team the time that they need to be able to get to that content is really critical. So this was a successful launch again for those reasons. Let's talk about a not so successful one. And this is where I learned that prioritizing a relationship with support is really important.
07:22
Mass Effect 2, great game. However, we had an issue where people who purchased the game through a pre-order were given a code to get DLC, downloadable content, once the game was released.
07:42
For some reason, it was really difficult or it just was a very confusing flow for people to redeem this DLC. So this launch generated a huge amount of calls and emails to our support center from gamers who couldn't figure out
08:01
how to redeem their DLC code. And this is something that would have been avoidable. So in this perspective, what the studio got wrong is that they didn't give support early access to the game and they didn't give the team that chance to be able to go through your game or go through your product with that sort of thinking cap
08:22
and they didn't give the team that time to think critically about what it is that they're supporting. I feel like had we been given that chance, someone would have said, hey, you know what? This flow is a little confusing. Maybe we should write some content around it. So eventually we did end up writing some content around it and that helped. I actually ended up making a video
08:41
on how to redeem the DLC, which I thought was silly at the time but it actually really helped the people on the front lines of support have something that they could point the customer to so they could perform some self-service. Having said that, what the studio got right after this is that after they saw the amount of volume that that question generated to our support center,
09:02
they were able to say, you know what? We messed up and moving forward, we're going to prioritize a relationship with support so that we can be more preemptive on future launches. So to give them that credit, they did realize this and then after that, they did prioritize a relationship with support.
09:20
And obviously, you know, this has been a while. I no longer work there. So I don't really know how things have gone. I don't think that they've ever had any issues. So let's move on to part two and let's talk about continuity. This is where my fun anecdotes
09:41
about working in the gaming industry stop and I start going into more about working in the tech industry and doing support specifically for a public API. So continuity, I see two types of paradigms of support and one works I think a little bit better than the other
10:01
so let's take a look at both. This assumes that if you are doing support, you have some sort of support queue or ticketing system. What I see typically is a pattern of either dedicated or rotating support.
10:20
What do I mean by that? Dedicated support is when you, depending on the size of your team, you have one person or a group of people whose sole responsibility is to answer support questions. You can also have a rotating support pattern where one person in your team rotates
10:41
whether it's weekly or biweekly and gets to put on that support hat. This is typically what I see in smaller teams where you might not necessarily have the volume to justify one person full-time always doing support. So a lot of the times, teams will have some sort of support rotation where someone trades in their responsibility
11:03
of answering support questions. Typically, I hear terms like support star or support firefighter for these types of support patterns. Let's talk about dedicated support. You can probably tell that this is a pattern that I like because I think that this works a lot better.
11:23
Dedicated support works because it allows you to have business continuity for your tickets. This means that it's someone's responsibility to answer those tickets and that means that it's that person's responsibility to ultimately let the customer know about a resolution. So they take a ticket from initial contact
11:41
through that solution and then they close that ticket and that means that you have continuity. That means that there's one person that knows what the final resolution to a particular issue or bug was and that is really important because that is what builds your support history
12:00
and then you can draw from that and you can find efficiencies based on that knowledge. Also, when you have someone whose sole role is to do support day in and day out, especially if they're in more of a support engineering type of role, when they start doing one task repeatedly, they'll find efficiencies and that might mean
12:20
that they get to write a script or a tool to automate that process and a lot of the times, you don't get to those tools if you don't have someone whose job is to do those repetitive tasks over and over again until they figure out, hey, there's an efficiency here that we could learn from. This also helps you build relationships with your customers.
12:41
For me, that's specifically important in my team because I support a product that is an API. So typically, the relationship that we have with our customer is really long. If they're really gonna integrate with our API, typically, we're talking about years of support. So establishing that relationship with the customer,
13:01
with the developer that's integrating our API is really critical because we wanna make sure that they feel comfortable asking questions. I often get things like, oh, well, I don't know if this is a bug and I don't know if I should tell you about it. No, absolutely, tell us about it. We wanna hear about it so that we can fix it and you don't get there unless you have that sort of relationship.
13:20
Now, where dedicated support stops working is when you don't have a way to sort of promote the people in your support engineering team out of support engineering. A lot of the time, support engineering is a role that is typically filled with junior developers and that was actually my case.
13:42
And I think it's a pretty great way to level up junior devs, but you have to have a plan to get them out of support once they grow out of that role and that is an issue that I see a lot of the times with support engineering positions. You just kinda get stuck in support and you don't really see a way to move out
14:02
of that position unless it's at another company. And from this perspective, you lose out on all of their relationship building or all of the knowledge that that person had while they were doing that job for you. So I think that having that exit strategy and a clear path to move forward from support engineering
14:21
for your team is really important so that your team members can avoid burnout. Now, let's talk about the other side of the coin and that's rotating support. Well, I think it doesn't work. So when you have a support rotation,
14:42
usually this means that you have a team and maybe you're support for the week and you get to man the front lines, you get to ask questions or answer questions from your clients. And then once that week is over, you go and you move on with your life and you forget about the support that you did that week.
15:00
And what ends up happening a lot of the times if you don't have a very clear process for this is that there's not a lot of follow through. So sometimes bug fixes or solutions or tickets sort of get lost in the shuffle because once someone is done with their rotation, they just kind of forget about their tickets. A lot of the times this can lead to handoff confusion.
15:22
A lot of the times teams that have a sort of rotation for support don't have clear processes for handing off the ticket. And this also leads you to a point where there's not a lot of business continuity for a specific ticket. So if you're handing it off, if it's sort of like a long-term fix
15:41
and you keep on handing off that ticket to the next person to do support the next week, you lose that history. So that can also be challenging. Also, you lose some efficiencies because each time depending on how long the period is between rotations, you might forget how to do something.
16:01
So there's always a period where someone gets back into the rotation and again, it's sort of like their week to be in support. They have to learn how to do support again and it takes a little bit of ramp up. So there can be some efficiencies lost from that pattern. I do feel like there is a way that you can have
16:22
this sort of support rotation pattern work and that is if you have very clear guidelines specifically around issues that are gonna take longer to solve than your rotation duration. So for example, if you get a ticket and you verify it's a bug and you feel like this is gonna be
16:40
a non-trivial issue to solve and it's definitely gonna take you longer than a week to solve, you need to create a support issue whether it's in Jira or whatever other ticketing system you use to track features or to track your sprint and you need to add it to the active sprint. What I typically see when people use this pattern
17:00
of rotation for support is that they might even get to the let's create a ticket and put it in our backlog but if they don't actually put it in the active sprint, that ticket might not get resolved until weeks, maybe even months later and by that point, if you have that ticket open
17:20
with your end user, whoever responded first is getting graded on that response time so by not adding that ticket to your active sprint, you can potentially increase that time for a resolution and that can really hurt the support engineer that's providing the support to the end user
17:43
because then that means that they had a ticket open for weeks or maybe even months and you really want to try to avoid that so that's why I think it's really important that if you do get to a point where you see an issue and it's gonna take a long time to solve, go ahead and add it to your active sprint, otherwise you might forget it.
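The rotation guideline described above can be written down as a small decision rule. The following is only a minimal sketch under stated assumptions: the `Ticket` fields, the `triage` function, the step names, and the one-week rotation length are all hypothetical and do not come from any real ticketing system or from the speaker's team.

```python
from dataclasses import dataclass
from typing import List

# Assumption: rotations last one week, as in the example from the talk.
ROTATION_DAYS = 7


@dataclass
class Ticket:
    ticket_id: str
    is_verified_bug: bool
    estimated_fix_days: int
    owner: str  # the person who was on rotation when the ticket came in


def triage(ticket: Ticket, rotation_days: int = ROTATION_DAYS) -> List[str]:
    """Return the follow-up steps for a ticket under the guideline above."""
    steps = []
    if ticket.is_verified_bug and ticket.estimated_fix_days > rotation_days:
        # The fix will outlast the rotation, so it is tracked where sprint
        # work lives, and in the active sprint rather than the backlog.
        steps.append("create tracker issue")
        steps.append("add issue to active sprint")
    # Ownership does not move when the rotation ends.
    steps.append(f"owner stays {ticket.owner}")
    return steps
```

Calling `triage(Ticket("T-1", True, 10, "ana"))` yields both tracker steps plus the ownership step, while a two-day fix yields only the ownership step.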
18:02
Also, and this is really big, if it happened on your rotation, you still own it. What I see most of the times with these types of support rotations is that again, I did my week of support, I didn't get this fixed but I'm not in support this week
18:21
so here you go and a lot of the times, there's a lot of history that's missed whenever there is a handoff like that between people that are doing on call and I feel like the best way to avoid any sort of scenario where an issue is lost track of is to make sure that you still own that issue
18:43
after your on call is over. This increases that ownership of that person to try to get that done, otherwise it might just fall through the cracks. We're gonna talk a little bit more about ownership in a second. Right now, we're gonna talk about ownership and escalations.
19:00
There's a couple of ways that I also see people handle ticket ownership and escalations. I call it removed versus owned and these are totally terms that I came up with. They're not established anywhere so if you find them a little confusing, let me know and I'm happy to clarify later. Essentially what I mean by this is that a removed escalation
19:20
is when you work as part of a team and you have to throw an issue over to the other side of the fence and then someone else fixes it and then they let you know, hey, it's fixed, get back to the end user and let them know that it's fixed. So there's sort of like a layer of removal between the person doing the fix and the end user.
19:45
I feel like this pattern can be really challenging because a lot of the times, teams lose a sense of the priority for the issue when you're throwing things over the fence.
20:00
This also means that when you're throwing an issue over the fence, if the other person on the other end doesn't actually have access to the end user, it loses a little bit of that impact. When you have a team member that is handling a support escalation
20:21
and they don't really get to talk to the client, I feel like that loss of accountability to the end user adds time to the resolution because typically when you leave people to their own devices and you say, hey, here's an issue, put it over the fence, fix it,
20:41
people tend to prioritize support a lot lower than feature work and again, this kind of relates or goes back to the idea that typically as an industry, we don't tend to think positively about support. We're not really excited about support; it's something we have to do, so that's why when you use this pattern
21:02
of a removed escalation and you throw something over the fence and the person doing the fix doesn't actually know who it's impacting, they're just not really gonna prioritize the work. They're also not being graded on an SLA like the person that's actually responsible for getting back to the end user with a resolution.
21:22
So when you throw things over the fence and it's someone else's job to actually fix the bug, that person is not actually being graded on the time that it took to resolve and the resolution back to the end user, that person is being graded on whether or not
21:41
the bug got fixed but the person actually getting back to the end user on the support side, that person gets graded on how long the whole interaction took. So if it takes this other person a lot longer to fix and they're not really prioritizing the fix, it hurts the other person on the support side of things.
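To make the grading mismatch concrete, here is a small arithmetic sketch. The function name, the field names, and all the timestamps are made up for illustration; the only idea taken from the talk is that the support side is measured on the whole interaction while the fixer is not measured on time at all.

```python
from datetime import datetime


def resolution_breakdown(opened, escalated, fixed, customer_notified):
    """Split a ticket's timeline into the two clocks discussed above."""
    return {
        # What the support engineer is graded on: the whole interaction.
        "support_clock": customer_notified - opened,
        # Time the ticket sat on the other side of the fence.
        "waiting_on_fix": fixed - escalated,
    }


# Hypothetical timeline: escalated within two hours, fixed three weeks later.
breakdown = resolution_breakdown(
    opened=datetime(2018, 5, 1, 9, 0),
    escalated=datetime(2018, 5, 1, 11, 0),
    fixed=datetime(2018, 5, 22, 11, 0),
    customer_notified=datetime(2018, 5, 22, 13, 0),
)
```

In this invented example the support clock runs for 21 days and 4 hours, of which 21 days were spent waiting on a fix the support engineer did not control.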
22:03
So this kind of goes back to that idea of like if you don't see that impact, people just don't care as much and this is, again, it's not malicious or anything like that, it's just that it can be really hard for you as a developer to prioritize support if you don't really see the impact of your work.
22:21
You're more likely to see impact for new features that you're building as opposed to maybe a specific bug or edge case that someone is experiencing. So let's talk about owned escalations and this is the type of escalations that I really like to do
22:43
and the way that I see it, these are a little bit different from when you throw something over the fence. In like an owned escalation pattern, when you send something over for another team to fix, it is that team's or that person's responsibility to ultimately get back to the client with a resolution.
23:03
So they are owning the solution and they are owning the response back to the customer and the reason why I think this works much better is because this allows you to have more of that business continuity if in my support ticketing system, I can actually say, hey, John, here's this ticket
23:23
I'm escalating to you. If you have any questions that are gonna help you solve that issue, go ahead and ask the end user directly. So this means that there's not so much of a middleman in between and then also the fact that that person gets to talk to the client and ask questions
23:41
brings the client back front and center so I feel like that helps sort of alleviate that issue of not seeing the impact of the thing that you're trying to solve. This also increases accountability because when you know specifically the person that you're talking to and how the issue is impacting them and their business,
24:00
I feel like you're much more likely to feel like, oh yeah, this is absolutely something that I really should fix and I should prioritize. So again, it increases accountability and also when it's your job to actually get back to the end user and let them know, hey, I went ahead and fixed this issue then you can also start seeing
24:20
any bottlenecks in the process because once you escalate a ticket, if it takes that person or that team a little bit longer to get back to the client, then you can start seeing where some bottlenecks might be happening and then you can start addressing that. From our perspective, even if we don't have a specific fix, we like to at least get back to the end user
24:41
and let them know, give them a status so that they're informed, hey, this is gonna take a little bit longer than we thought to fix, we just wanted to give you the heads up that we're still working on it, and we check in every couple of days just to give people a little bit of an idea if something is gonna take a long time
25:01
so that also increases that accountability for someone because if you're having to tell your end user, we're still working on it, we're still working on it, it might sort of like help you or motivate you to get something done. The next thing that I wanna talk about is boundaries
25:23
and as I said earlier, this is something that we don't typically talk a lot about when we talk about support, if ever and I think that this is something really important because if you don't have clear boundaries with how you communicate with your clients, this can actually potentially lead
25:40
to some serious burnout from people so let's talk a little bit about boundaries with clients. One of the things that I see and this is typically something that I see for smaller teams where they can't really have 24/7 support so if you're working for a large corporation, this might not necessarily apply to you
26:00
if you need 24/7 support, typically people have support centers all over the world so that they can get or catch every time zone but for me, working in a mid-sized team, we really can't do that so in order to be able to get back to people within a reasonable amount of time,
26:22
we established support hours and we stick to them and we have an automated ticket if someone contacts us after hours to let them know, hey, we are based out of this time zone and we operate during these hours and if you send us an email outside of those business hours, we did get it
26:42
but just FYI, we'll reply within this time frame and that helps your team know that they don't have to be constantly checking for new tickets and they don't have to be getting back to people at non-business hours. Also, one thing that I've learned sort of the hard way
27:05
is that you need to allow for a reasonable amount of time between responses. I had an issue where I had a developer who was integrating our API and we typically don't do code review but it was a particularly slow day
27:22
so the developer sent me some code and he said, hey, I'm getting this error and I decided to look at it and it wasn't even an error with our API, it was a Ruby error so I was helping this person program some Ruby even though that's outside of the scope of what we can really do with support
27:42
at least in our perspective but I still helped him out and I let him know and I sent him on his way and then a few minutes later, he reopens the support ticket and he's like, oh, well, yeah, I'm not getting that error anymore but now I'm getting this other error and again, it was another sort of like
28:00
Ruby programming error, it was an error with his code, it wasn't an error with the API and I decided to help him again and then he reopened another support ticket and now I'm getting this error and again, I was like, I think he's using me as his own personal Stack Overflow and if I continue to do this,
28:22
I might write the entire application for him through support tickets and that can be a huge liability for you if you're in this sort of support type of business where you're helping other developers integrate with a product. That's a liability because if the other person,
28:41
the other end user developer doesn't actually know how the integration works, they're not gonna be able to troubleshoot any issues on their own and also very specifically with code and code samples, I often have to tell developers that specifically ask me, can you please write this bit of code for me?
29:01
I'm sorry but we can't write your code because if you don't understand how this works, then we're liable to you for that little piece of your code base and we simply cannot be liable to all of the developers for their own integrations. Again, it's setting that boundary of we work on this API
29:21
and we will help you integrate with this API and we will help you with any errors that you encounter while doing your integration but we very firmly cannot do the integration on your end or write your code for you. I feel like that was a very healthy boundary for us to establish because again, it could potentially open us up to issues
29:42
and allowing a buffer of time between responses solved my specific issue of people trying to treat me like I'm their personal Stack Overflow because when I allowed more time between responses, it allowed that person to go and try to figure things out on their own
30:00
and it actually unblocked them. So now even if I do have the time because it might be a particularly slow day, I try to timebox when I actually answer questions so that I'm not constantly being someone else's crutch.
30:21
I think that crutch is a little bit of a harsh term but essentially when people get used to really good support, it does become a crutch. So it's a fine line to walk and I think that going back to a healthy amount of time between responses
30:40
can really help you not be a crutch for the people that you're supporting. Here's a recent interaction I had where I practiced this idea of a little bit more time between interactions and then he came back and he said, you know what, actually I read your docs,
31:01
I read the freaking manual and it's okay now. And this is the type of stuff that I love to see when people are able to say, I read your docs, it helped me, it unblocked me, yay. Let's talk about good behavior and bad behavior and reinforcing good behavior. This is an actual email that I got fairly recently from a developer.
31:23
I'm not gonna read everything out, I tried to redact it a little bit but essentially this guy is really, really angry and you can really tell from the tone and he's saying, sorry but this whole thing is messed up. I was trying to test your stupid webhooks
31:41
and they don't work. And this is a really big deal to me because our support tickets don't just go to the support team, they go to the entire engineering team and for our engineers to see this, it gets them really riled up, it can be really demoralizing
32:00
so from that perspective, I wanted to put an end to that type of behavior real quick so this was my reply. First of all, I tried to pull rank and say, hey, I'm the lead support engineer on this team because sometimes that helps take people out of whatever foul mood they were in. Sometimes people just wanna talk to someone in charge
32:22
so I said, hey, I'm the lead support person here and I just wanna let you know. I will get to your question which actually was user error but I said, I just want you to know that we have zero tolerance for that type of language and that type of behavior.
32:42
This goes out to the entire team and we will not hesitate to ban any developers that cannot stick to a modicum of decorum when talking to us and our team. And after this, he never talked to us like this again. His tickets became way more reasonable. He's still working with us, he's still working on this API
33:02
but I needed to shut that behavior down because I knew that if I didn't address it really quickly and sternly, it would continue. This is where I can't stress enough that your team's sanity and happiness is so worth it and it comes first
33:20
because when I wrote that message, I didn't have to go and ask my product manager if it was okay for me to send that answer. I knew that they would have my back when I wrote that reply. So if you are managing your support engineers in any way, shape or form,
33:41
you need to empower them and you need to let them know that their happiness ultimately really matters and if you have to step in and talk to someone in this way to let them know that their behavior is not gonna be tolerated, your support engineers need to know that you have their back
34:01
because really, if you don't have their back and you don't take their happiness into the equation, you're gonna have a bad time. If your engineering team is unhappy, it's really gonna affect morale but also, doing support engineering is so rough.
34:22
Again, you get people that talk to you like that fairly regularly so people can get burnt out doing this so really, you want to make sure that you have your team's back. And in this case, I'm happy to say that my team had my back and I felt empowered to tell that developer to quit that type of behavior.
34:42
Lastly, I wanna talk about tools for the job and specifically around documentation. Typically, we talk a lot about documentation, documenting our code, we talk about tests being documentation and it's all great but we rarely talk about documentation
35:01
as documenting things like troubleshooting or a thought process or how to do common tasks and when you're doing support engineering, this is so critical. So I came up with this idea for my team of the support playbook. So this is a living, breathing document
35:22
where we document processes of how to do common tasks and this is really important because when you onboard someone else to do support for you, you don't have to specifically pair with them 24/7 while they're just getting started. You can actually say,
35:41
hey, jump on the ticketing queue. If you have any questions, consult the support playbook first. If you can't find the answer, let us know and we'll pair on this with you and this actually allowed us to onboard people onto support a lot quicker because people felt empowered to go and find the answer
36:02
for a specific issue on their own before having to try to escalate the issue. So for these reasons, I think that a support playbook is really vital for your support team. It helps reduce training time and then it also allows you to find efficiencies for automation.
36:22
If you find yourself and your team constantly looking at a specific page on your support playbook wiki on how to do a certain task, you know that that's something that could possibly be a good candidate for automation. I also think it's really important to empower your team with lots of data.
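As a rough sketch of that idea of spotting automation candidates, you could count how often each playbook page gets opened and flag the ones people keep coming back to. Everything here is an assumption for illustration: the page names, the list of views, and the threshold of five are made up, and how you would actually collect page views depends on your wiki.

```python
from collections import Counter


def automation_candidates(page_views, min_views=5):
    """Return (page, count) pairs viewed at least min_views times, most viewed first.

    page_views is a hypothetical list with one entry per time someone
    opened a playbook page; min_views is an arbitrary cutoff.
    """
    counts = Counter(page_views)
    return [(page, n) for page, n in counts.most_common() if n >= min_views]


# Illustrative data only: these page names are not from the talk.
views = (
    ["rotate-api-key"] * 6
    + ["replay-webhook"] * 2
    + ["look-up-request-logs"] * 5
)

for page, n in automation_candidates(views):
    print(f"{page}: opened {n} times, possible automation candidate")
```

A page that shows up here is only a hint; whether the task is worth automating is still a judgment call for the team.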
36:41
So when it comes to data, I live and breathe my dashboards. Specifically, we use Datadog and Grafana. I particularly like Grafana because it has an open source version that you can self-host. And what this helps me do as a support engineer is when someone has a question,
37:00
I get a lot of panicked developers saying, your API is down. It's not updated on your status page. And then I go and I have the power that this data gives me to be able to tell the developer, no, it's not down. But I do see this issue. Let's go and figure that out. So having that data can really empower your team. Logs, logs are my friend.
37:22
And I could not live without Scalyr. Scalyr is an amazing log aggregation tool that we use. We actually have three different accounts that we sort of switch back and forth between. I am not paid by these people at all. I just wanted to give them a shout out because I could not do my job without this tool that helps me see all of the logs.
37:44
But it also helps us set up alerting and that alerting can go to our Slack and that alerting can also open up incidents in our paging system. So this is vital to us. We also use Runscope for automated API testing.
38:00
Again, not being paid by them. Just a really neat product. And the way that we use this to monitor our product is that we have Runscope tests running about every five to 10 minutes and the results go into a Slack channel and we're able to track response times and whether something is succeeding or failing. And this actually allowed us to see
38:20
as we were working on new features and improving our API, it allowed us to see how our response times were actually improving. So I highly recommend this tool. Runscope also makes an open source tool called RequestBin that is really helpful whenever you're trying to troubleshoot endpoints or specifically webhooks.
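To make the scheduled API checks described above a bit more concrete, here is a minimal sketch of the same idea: time a call, record whether it passed, and build a one-line summary that could be posted to a chat channel. This is not Runscope's API; the function names, the one-second limit, and the check names are all assumptions, and the scheduling (every five to 10 minutes) and the actual posting to Slack are left out.

```python
import time
from statistics import mean


def run_check(name, call, max_seconds=1.0):
    """Time one check. `call` is any zero-argument function that raises on failure."""
    start = time.perf_counter()
    try:
        call()
        succeeded = True
    except Exception:
        succeeded = False
    elapsed = time.perf_counter() - start
    # A check passes only if the call succeeded and stayed under the time limit.
    return {"name": name, "ok": succeeded and elapsed <= max_seconds, "seconds": elapsed}


def slack_summary(results):
    """Build the message text for a non-empty batch of check results."""
    failed = [r["name"] for r in results if not r["ok"]]
    avg_ms = mean(r["seconds"] for r in results) * 1000
    status = "all passing" if not failed else "FAILING: " + ", ".join(failed)
    return f"API checks: {status} (avg {avg_ms:.0f} ms over {len(results)} checks)"


# Stand-ins for real HTTP requests against your API (hypothetical names).
def healthy_call():
    pass


def broken_call():
    raise RuntimeError("simulated 500 response")


results = [
    run_check("list-orders", healthy_call),
    run_check("create-order", broken_call),
]
print(slack_summary(results))
```

Keeping the recorded `seconds` values over time is what lets you see whether response times are improving as the API changes.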
38:41
Runscope used to host a version of it that allowed you to create an ephemeral endpoint. Unfortunately, some people were abusing that. But RequestBin is open source and you can self-host an instance of RequestBin on Heroku and that's what I do a lot for testing things
39:01
and also helping other developers test our endpoints. So, TL;DR, I talked a lot about support engineering and sort of best practices and what it means to me. And I feel like support engineers, by definition, are good at troubleshooting.
39:21
They're also good at communicating because they're having to talk to end users or maybe even between teams. They're also good at seeing patterns. Once you start doing something like support engineering every day, you'll start seeing patterns emerge that'll help you again gain efficiencies. So people that do this day in and day out can be really great at seeing patterns
39:42
within your own product. So I think that a good support engineer is a good engineer. And I really believe that support engineering can be a really great way to level up junior developers as long as you're giving them a path to move forward within your organization to other engineering opportunities within your team.
40:03
So having a good culture, having a culture that prioritizes support engineering simply means having a good engineering culture, period. If you have any questions about support engineering, I'm happy to talk to you after this,
40:21
maybe out in the hall. My name is Cecy Correa. You can find me on Twitter at cecycorrea. I'm always happy to talk about support with you. Thank you.