atuin: magical shell history with Rust
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61630 (DOI)
FOSDEM 2023 (283 / 542)
Transcript: English (auto-generated)
00:06
Hey everyone, and welcome to my talk, "Keeping history in sync with turtles and magic", or the same shell history everywhere. I'm going to be talking to you today about a project I've been building on and off for the last two years or so. So to get started,
00:23
who am I? My name is Ellie and I'm the lead infrastructure engineer at a company called PostHog. When I'm not writing software for work, I try and maintain a couple of side projects in my free time. When I don't have the energy for that, I'm normally exploring outdoors, which as you can probably see is usually on a motorbike for me. So to dive
00:42
into Atuin, first of all I'm going to start with the name. Originally it was called shync, for shell and sync, but I couldn't really say that out loud without cringing. So I looked for something new. I've been a fan of Terry Pratchett's Discworld books for a really long time. And for those who are unfamiliar, the sort of premise there
01:03
is that the world is a disc and it rests on the shoulders of four giant elephants stood on the shell of a space turtle called the Great A'Tuin, which I'm probably mispronouncing. I thought it would be a bit pretentious to include the words "the great" in my project name, and putting an apostrophe in a binary name is probably not a good idea. So I ended
01:23
up with just the name Atuin. A little bit more specifically, Atuin was made to synchronize shell history between multiple computers. I had the problem that I would be switching between a whole bunch of laptops. I'd be remoting into various different boxes
01:40
and trying to find one command that I'd run a few days previously, on whichever computer it was, was pretty difficult. So I wanted it all in the same place. The first thing I did was replace the normal zsh history, Bash history, or whatever fish uses (I don't really remember) with a SQLite database. And we could then have some functions to import
02:03
your normal text history into the database. And because databases are a little bit more flexible than flat text files, we could also include some additional context. So in the case of Atuin, this is context such as how long a command took to run, whether or not it was successful, which directory it was run in, as well as the shell session.
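To make the idea above concrete, here is a minimal sketch of a history table that stores that extra context alongside each command. The table name, column names, and helper functions are assumptions made for this illustration; they are not Atuin's actual schema or API.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not Atuin's real ones.
SCHEMA = """
CREATE TABLE history (
    id          INTEGER PRIMARY KEY,
    command     TEXT    NOT NULL,
    cwd         TEXT    NOT NULL,   -- directory the command was run in
    session     TEXT    NOT NULL,   -- shell session identifier
    started_at  INTEGER NOT NULL,   -- Unix timestamp in seconds
    duration_ms INTEGER NOT NULL,   -- how long the command took
    exit_code   INTEGER NOT NULL    -- 0 means the command succeeded
)
"""


def open_db(path=":memory:"):
    """Open (or create) a history database with the sketch schema."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, command, cwd, session, started_at, duration_ms, exit_code):
    """Store one finished command together with its context."""
    conn.execute(
        "INSERT INTO history "
        "(command, cwd, session, started_at, duration_ms, exit_code) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (command, cwd, session, started_at, duration_ms, exit_code),
    )


def successful_in_dir(conn, cwd):
    """Return successful commands run in `cwd`, newest first."""
    rows = conn.execute(
        "SELECT command FROM history "
        "WHERE cwd = ? AND exit_code = 0 ORDER BY started_at DESC",
        (cwd,),
    )
    return [row[0] for row in rows]


conn = open_db()
record(conn, "git pull", "/home/me/repo", "s1", 1000, 420, 0)
record(conn, "make build", "/home/me/repo", "s1", 1010, 9000, 2)
record(conn, "ls", "/tmp", "s2", 1020, 3, 0)
print(successful_in_dir(conn, "/home/me/repo"))
```

Because each row carries the directory, session, and exit code, filters like "this directory only" or "successful commands only" become plain SQL conditions, which a flat text history file cannot offer.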
02:25
So the way we do this is we plug into your shell. If your shell supports it, it's via the normal shell hooks, like pre-command or pre-exec and post-command, I think they're called. But in the case of bash, which I do not have positive feelings towards,
02:41
we do a really horrible hack with the prompt. So hopefully you can see the GIF on the right. On top of this database, we also built a search TUI. This is bound by default to Ctrl-R and the up arrow, which is a little bit contentious for some people, so
03:02
you can remap that too. The search UI has three different search modes. One of them is a fuzzy search, kind of inspired by fzf. Another is a prefix search, which is pretty self-explanatory, and the third is a substring search; same thing, you should know what that means. We also have several different filter modes. So Atuin allows you
03:26
to search your shell history for the current session, for the current directory, for the current machine, or just all of your shell history for every machine that you've ever connected, anyway. It would be cool if it could do otherwise. A little bit more on that extra context. Atuin has a stats command, which analyzes
03:45
all of your history and will show you things like the most used command, which for me is ls. I didn't realize I ran that so much. It also shows how many commands you have run, as well as how many unique commands you've run. We're definitely not making the most of all the data available, and there's a lot more sort of cool analysis we
04:03
could do. And you can also get the stats for a specific day or week or month or whatever. A little bit more on the search. You don't have to use the search UI. We also have a command line search interface. This is kind of useful if you have like a specific
04:22
command in mind. Maybe you know roughly when it was or roughly what it looks like. And it's also useful to integrate with other tools. So someone on the Discord told me that apparently they've used this to integrate directly with fzf as their search instead, which is pretty cool. So you can see here that I'm searching all successfully
04:41
run commands after yesterday at 3 p.m. that start with git. Obviously, I did not make these slides today. The time specifier supports a human way of expressing time, and the command search supports regular expressions. A little bit more about the sync server.
05:02
It's a pretty boring HTTP API that shares blobs. It has no idea what the blobs actually contain. It was originally written with Warp, which I found to be very fun. Kind of a nice mental exercise, I guess. But we ended up rewriting it with Axum, because
05:23
while Warp was fun, it was difficult for contributors to figure out how to use, and it also contributed pretty massively to a high compile time. Axum has just solved the problems there. The Atuin sync server is completely self-hostable. Anyone with it installed can just run atuin server and have a running server.
05:44
We also have Docker images and Kubernetes manifests for anyone that wants to get a little bit more fancy. A little bit more on the sync: it's not quite real time yet. While I would love it if it was, it currently syncs at an interval of 15 minutes. You can reduce this down to zero, which basically means it will sync
06:02
after every single command. If you don't fancy running your own infrastructure, there's a public deployment of Atuin that I run. Currently it's got about 11 million lines of shell history on it. There are about 300 active users. And it's all running on just one dedicated Hetzner box. And it handles way more requests than I
06:24
thought it ever would. I'd also like to thank my GitHub sponsors; I didn't really expect anyone to contribute. But they cover the server bills entirely now, which is a really nice feeling. And a little bit more about privacy. I imagine people here probably feel more strongly about that than
06:41
others. Everything's fully end-to-end encrypted in the sync, because I really don't want the responsibility of having API keys that people accidentally pasted into a shell on my machine. And we use libsodium's secretbox because I'm not at all a cryptographer, and it's more difficult to mess up than most other things.
07:03
Finding a reliably maintained library for that was a bit tricky. The original bindings we used were not maintained beyond security patches. We recently switched to, I think, RustCrypto, if I remember rightly. All of the encryption keys get automatically generated when you log in, and you have
07:21
to keep track of them yourself. So if you lose your keys, there's nothing I can do. Your data's gone. So why Rust? This is the Rust dev room, after all. Atuin runs twice for every shell command you run: just before and just afterwards. That lets us get the timing data and everything else. And if we had latency there for an interpreter to start up or a
07:44
runtime to do whatever it does, the experience would not be great. If you added 50 to 100 milliseconds to every command you ran, people would pretty rightfully complain. So Rust fits the bill very nicely there. It also has to be reliable, because if we're dropping shell history
08:03
randomly, then it's not at all serving the purpose it was written for. Having a static binary to deploy is also really nice. No one has to make sure they have Python 3.7, not pointing at any languages in particular, installed on their system with the right versions of various
08:20
libraries installed or anything like that. And it's also safe. So I don't have to worry about any memory issues or anything like that. The other factor which I think for a side project is especially important is that Rust is fun. When I started this project, I was also considering using Go. And I was also
08:40
writing Go for my day job. And I didn't really fancy the idea of getting home after work, having written Go all day, and then writing some more Go. So Rust solved that very nicely. And I think the main reason I actually got around to finishing this is because I was enjoying writing it. Additionally, the Rust community is fantastic. Every time I've asked for help, people have been
09:00
really helpful. Everything I wanted has been available. And they're just generally very welcoming and accepting, especially compared to some other tech communities. So I actually have one other service. And I'm glad most of the previous talks have discussed Python, because now I don't feel as weird for mentioning it in my presentation too. I have another service
09:21
called Rincewind. Bit of a naming pattern there, if anyone is familiar with it. And what this basically does is it peeks into the database and generates graphs like this, which are heavily inspired by the GitHub commit activity chart, but for your shell history. And it's currently closed source for no real reason other than that it's a really horrible hack
09:42
that I don't want to package nicely for anyone. It mostly uses NumPy and OpenCV and a few other things. It's also completely opt-in. So you don't get this by default. If you don't want any proprietary code touching your data, you don't have to. It's cool. Just with one curl command, you enable this. On the open source side of things, this is
10:04
the first open source project I've released that people have actually been interested in. I made it just for myself and stuck it on my GitHub. And it ended up being quite well received by a whole bunch of people. We ended up in a lot of package repositories. I think off the top of my head,
10:21
it's the Arch Linux community repo, Homebrew, Alpine Linux, and some Nix. I'm not entirely sure how Nix works, but one of the Nix repositories. And there's probably a whole bunch more that I'm not aware of. And we've actually got 63 contributors as of today. Some of them are sort of returning regular contributors, which is very
10:42
nice that people want to regularly give time to my project. Some of them are just sort of drive-by. They found something that annoyed them or a bug they wanted to fix or something like that. So they contributed, which was lovely. I'd also like to especially thank Conrad. He's much more involved in the Rust community than I am and also a very long-term friend of mine. He helps me maintain
11:02
Atuin. And when I was first starting and not so good at Rust, he did a great job of tightening things up a bit. In terms of the future, right now, Atuin has a bit of a flaw in that you can't actually delete history once it's been synced. This is mostly because the sync's pretty eventually
11:21
consistent and every machine you have is a potential writer. So, ensuring that you delete something and it stays deleted is actually really difficult. I've currently got a solution to it, which works on my laptop. I just need to make sure it works on everyone else's too. I'd also like to sort out Bash because pretty much all the
11:40
complaints we get about shell integrations are from people running Bash and it's very frustrating, I think. I don't actually use Bash and I hate having a setup on my machine just for that. I'd also like to show some more information in the TUI. So, I don't know if you saw very much on the GIF earlier, but it basically just shows what's useful for search results. I would love
12:02
it if there was another tab where you could also see sort of statistics about a command that's run, maybe how often it succeeds versus fails. You could get some nice stats about make build that way and that sort of thing. I'd like to improve the search a little bit too because right now it's good enough. I think it could always be
12:21
improved. I've been meaning to explore some of the full text search modules that SQLite has, or maybe something like Tantivy or one of the other search libraries in Rust. Otherwise, I'd really like to improve the sorting. Right now we sort chronologically, which is a pretty safe default. I'm not going to turn this into a
12:41
horrible Twitter timeline type thing, but it would be nice if we could sort based on the context we have. So, maybe every day at 9 a.m. you cd into your repo and you run git pull. By default, it would be nice if you pressed Ctrl-R and git pull was already there at the time that you frequently run it. We've got all the data for that. It just needs to be plugged together.
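The context-based sorting described above is a future idea in the talk, not an existing feature. One very simple way it could be sketched is to count how often each command was run in the current directory at the current hour of day and rank by that count. The function name, the tuple layout, and the use of UTC hours are all assumptions made for this illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def rank_for_context(history, cwd, hour):
    """Rank commands by how often they ran in `cwd` during `hour`.

    `history` is a list of (command, cwd, unix_timestamp) tuples.
    Hours are taken in UTC to keep the sketch deterministic; a real
    tool would more likely use the user's local time.
    """
    counts = Counter()
    for command, entry_cwd, timestamp in history:
        entry_hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
        if entry_cwd == cwd and entry_hour == hour:
            counts[command] += 1
    # most_common() sorts by count, highest first.
    return [command for command, _ in counts.most_common()]


def ts(day, hour):
    """Helper for the example: a Unix timestamp in February 2023 (UTC)."""
    return datetime(2023, 2, day, hour, 0, tzinfo=timezone.utc).timestamp()


history = [
    ("git pull", "/repo", ts(1, 9)),
    ("git pull", "/repo", ts(2, 9)),
    ("ls", "/repo", ts(2, 9)),
    ("make build", "/repo", ts(2, 14)),
    ("git pull", "/other", ts(3, 9)),
]
print(rank_for_context(history, "/repo", 9))
```

Everything the ranking needs (directory and timestamp) is already part of the recorded context, which is the point made in the talk: the data exists, it only has to be combined.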
13:01
In the even further future, given the number of people that have spoken to me about the fact that they have development API keys in their shell history, it would be nice if we could do something to get those out of the shell history and sync them alongside the data. Being able to bookmark commands is also
13:20
something I would quite like to be able to do because there are some longer commands I run frequently and search for frequently. Having some sort of hotkey or alias would be really nice. Otherwise, I realized that a subset of Atuin's history could also be used as a runbook if you had a begin and an end marker to it and you could just replay some commands
13:42
from your past. That's actually it. I went a bit faster than I was expecting. But if there are any questions, I'd be very happy to answer them.
14:06
Can you search for things which have come after your most recent command frequently? I'm not sure what you mean, sorry. So, to take what you've just typed and see what you typically do next, so actually returning the command after the one you've searched for.
14:22
That's one of the things I'd love to be able to do with the smarter ordering is know a sequence of commands that's commonly run and predict the next one based on history, if that's, yeah.
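The "predict the next command" idea from this exchange is also only a wish in the talk. A minimal way to sketch it is to count, for every command, which command most often followed it. The function names and the plain list-of-strings input are assumptions for this illustration.

```python
from collections import Counter, defaultdict


def build_followers(commands):
    """Count, for each command, which commands came directly after it.

    `commands` is a chronological list of command strings.
    """
    followers = defaultdict(Counter)
    for previous, following in zip(commands, commands[1:]):
        followers[previous][following] += 1
    return followers


def predict_next(followers, command):
    """Return the most frequent follower of `command`, or None if unseen."""
    if command not in followers:
        return None
    return followers[command].most_common(1)[0][0]


commands = [
    "cd repo", "git pull", "make build",
    "cd repo", "git pull", "ls",
    "cd repo", "git status",
]
followers = build_followers(commands)
print(predict_next(followers, "cd repo"))
```

This only looks one command back; a smarter ordering could also use the directory or time of day from the recorded context.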
14:42
So, I tried to install your tool, but I'm using Bash and I was wondering how far are you with fixing Bash? Bash generally works fine. It's usually the people that have a whole bunch of Bash plugins installed or have a weird Bash prompt that start to have some issues, but generally it's okay for most people.
15:04
Yeah, sorry. Does it handle having different shells on different computers? For example, if I'm using fish on one computer and zsh on another, does the thing work
15:23
between those two? Yes, so we translate from whatever your shell uses natively into the format we use, so whichever shell you use on each machine doesn't matter. Ah, okay, thanks. I have a couple of questions.
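As an illustration of the translation step mentioned in that answer, the sketch below turns two native history formats into one common record shape: zsh's extended history lines (`: <start>:<elapsed>;<command>`, written when the `EXTENDED_HISTORY` option is set) and plain one-command-per-line history. The record layout and function name are assumptions, and multi-line commands are ignored for brevity; this is not Atuin's importer.

```python
import re

# zsh with EXTENDED_HISTORY writes lines like ": 1675000000:3;git pull"
ZSH_EXTENDED = re.compile(r"^: (\d+):(\d+);(.*)$")


def parse_line(line):
    """Convert one native history line into a common record (a dict)."""
    match = ZSH_EXTENDED.match(line)
    if match:
        start, elapsed, command = match.groups()
        return {
            "command": command,
            "timestamp": int(start),     # Unix time the command started
            "duration_s": int(elapsed),  # elapsed seconds
        }
    # Plain history (for example Bash without timestamps): command only.
    return {"command": line, "timestamp": None, "duration_s": None}


print(parse_line(": 1675000000:3;git pull"))
print(parse_line("ls -la"))
```

Once every shell's history is mapped to the same record shape, it no longer matters which shell produced a given entry.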
15:40
First, I didn't quite get how you authenticate with the server by having a key. So, the user authentication is just a username and password, but then your actual data is encrypted by a key that's only held locally. All right, and second question. Do you have an Oh My Zsh plugin or have you considered one?
16:02
So, we have a ZSH plugin. You can use normal ZSH plugin managers to install and use it. All right, thank you. Getting some exercise in.
16:24
Is it possible to disable the history for a few commands and then re-enable it? Not currently. We have spoken about the idea of an incognito mode. If you prefix a command with a space, it won't be saved. So, it's kind of annoying if you've got to run a lot of them in a row.
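The leading-space rule from that answer is easy to state as code. This is a sketch of the rule as described in the talk, with a hypothetical function name; it is not taken from Atuin's source.

```python
def should_record(command_line):
    """Return True if the command should be saved to history.

    Commands that start with a space are skipped, as are empty lines.
    """
    if not command_line.strip():
        return False
    return not command_line.startswith(" ")


print(should_record("git pull"))
print(should_record(" export API_KEY=secret"))
```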
16:41
We have some questions from Matrix. So, Olivier Robert says, how would it interact with something like Starship? I actually use Starship, and it doesn't interact with it at all, in that it works completely fine. And, yep, that was the only question.
17:04
Cool, thank you. There's one at the front too. Actually, two short questions. The first one is, since Bash is so problematic (I'm using Bash), what's your favorite shell?
17:20
I like zsh, I think, purely because I started using it maybe 10 years ago and habits are hard to break. I think if I was going to start again, I'd probably try fish a bit more. And a question about the timestamps. Are you using the client-side timestamps from the machines or server-side? So, we actually store client-side; the timestamp will be whatever your client's is,
17:42
but we actually use two timestamps for sync to work. So, we have the server local timestamp, which is only really used for syncing, and then the actual data, it's all encrypted and hidden, so it's whatever your client stores. Yeah, because sometimes the local timestamp is important if you want to sync with the system, whatever, but sometimes also the real time. Yeah.
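The two-timestamp arrangement in that answer can be sketched as follows: the server stamps each uploaded blob with its own clock and uses only that value to answer "what is new since my last sync", while the client's timestamp lives inside the blob, which the server cannot read. The class and method names are hypothetical, a counter stands in for the server clock, and the blobs here are plain JSON bytes standing in for data that a real client would have encrypted first.

```python
import itertools
import json


class BlobServer:
    """Toy in-memory sync server: stores opaque blobs with a server timestamp."""

    def __init__(self):
        # Stand-in for the server's wall clock, so the sketch is deterministic.
        self._clock = itertools.count(1)
        self._records = []  # list of (server_ts, blob)

    def upload(self, blob):
        """Store a blob and return the server timestamp assigned to it."""
        server_ts = next(self._clock)
        self._records.append((server_ts, blob))
        return server_ts

    def download_since(self, last_seen):
        """Return (server_ts, blob) pairs newer than `last_seen`."""
        return [(ts, blob) for ts, blob in self._records if ts > last_seen]


def make_blob(command, client_ts):
    """Stand-in for client-side encryption: NOT encrypted in this sketch."""
    return json.dumps({"command": command, "timestamp": client_ts}).encode()


server = BlobServer()
first = server.upload(make_blob("git pull", 1675000000))
server.upload(make_blob("ls", 1674990000))  # an older client-side timestamp

# A client that has seen everything up to `first` only fetches the rest.
new_records = server.download_since(first)
print([json.loads(blob)["command"] for _, blob in new_records])
```

Keeping the sync cursor on the server's clock means a client with a wrong local clock can still sync correctly, while the displayed time of each command remains whatever the client recorded.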
18:01
Computers can be out of sync, which shouldn't happen. I had a bunch of issues with timestamps when I was first writing it, but we got it all sorted out in the end. Is there a limit to the length of a command? For example, imagine a huge pipeline with SQL and jq queries in there. Currently, it's eight megabytes of whatever it is
18:23
once it's been encrypted. It's only a server-side limit, and it's pretty arbitrary. And another question. Any plans for special handling for similar commands? For example, while you fix syntax, you run similar commands in a row. I hadn't really thought of that before,
18:41
but it might be worth considering. Sorry, I did have a few more questions from Matrix. I think my device is not synchronizing properly, but Andy sent me a screenshot. So does it integrate with regular history mechanisms provided by the shell? For example, excluding certain commands automatically,
19:01
like cd and ls, skipping storing in history by prefixing with whitespace for sensitive commands, et cetera. So the prefixing with whitespace is included. The default ignoring is not, but it doesn't actually replace the text file history either. You will still write to that, in case you ever decide you want to stop using it.
19:21
Okay, and where would context-aware recommendations come from? So if we have a history of your shell, we know the directories you're in, we know what commands you've been running at what times. So if we're predicting the next command that you want to run, we could use your own history. But the question follows up with, it's end-to-end encrypted.
19:40
Oh, it would all be from the client. So there's nothing. The server's just a dumb blob store. It doesn't really know much of anything. Any more questions? I think that's it. Awesome. Thank you.
20:03
Thank you. That was really well.