The Things Git Can Do (that none of the GUIs have ever told you about)
Formal metadata
Number of parts | 96
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use and modify the work or its content for any legal, non-commercial purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers | 10.5446/51850 (DOI)
Transcript: English (automatically generated)
00:04
Alright, let's get started. Hello everyone. How are you doing? Tired after a week of sessions and stuff? Well, we are going to ease into a light topic: Git.
00:22
And as I always say, if we are going to do some command line stuff with Git, it's always better to do it on a Friday, right before lunch. So we are right on. Well, my name is Enrico Campidoglio. Welcome to this session. This session is about Git, so I'm kind of counting on you guys using Git.
00:45
How many of you are using Git today? Okay, so alright, pretty much all of you, great. But the question is how many of you are using it from the command line? Alright, okay, about half I would say. And I guess the other half is using it through a GUI.
01:04
Is there a third way to use it that I don't know of, like voice control? No, alright, not yet. So let me tell you a little bit about myself. I work at a relatively small consulting firm called tretton37.
01:21
I don't know how many of you are from Scandinavia, but tretton is the Swedish word for the number 13. So it's like 1337. And if any of you were around IRC channels in the early, well, mid-90s, you will know that that's how we spell the word leet: 1-3-3-7. Anyway, geeky stuff.
01:41
What do I do there? I work as a programmer. Now, is there anyone here who has seen me show this picture before? No, great. In that case, I can tell you this little story. When I show this picture, sometimes people laugh, sometimes people don't laugh. And when they laugh, I feel kind of bad because, you know,
02:00
this is actually me on my first programming job in 1989. And we worked out of a cottage, apparently, but I don't smoke anymore. But we're not here to talk about the good old times, nor IRC channels. We are here to talk about Git. So let me first tell you how I got started using Git
02:21
because it's kind of an interesting story. Actually, my first distributed source control system wasn't Git. It was something else called Mercurial. Now, has anyone here used Mercurial before? Okay, so actually a fair number of you. So Mercurial is a distributed source control system and it was, well, I would say one of the strong arguments
02:44
was that it was Windows friendly, whatever that means. However, I wasn't really satisfied with Mercurial because Mercurial actually behaved a lot like other source control systems that I knew from before.
03:01
For example, like Subversion, CVS and other good stuff. In the sense that when something was done, there was no way to undo it, even if it only existed on my local machine being a distributed source control system. So it didn't take me long to find out about another
03:20
distributed source control system, and this was about six or seven years ago, and it was called Git. Now, as soon as I started using Git, I never looked back to Mercurial ever again. But why did I like Git so much? Well, Git did at least two things differently than Mercurial
03:42
and all other source control systems before it. One of the things is the way it views history. Now, we all know that the history of our codebases is very important, right? Do we agree on the premise? History is very important.
04:01
Even the Chinese philosophers of the past knew that, that if you are going to move forward in a codebase, you need to first find out how it got here in the first place. So the history of our codebase is a journal that documents the transformations that your code has gone through
04:21
over time, and why. So we really need to take care of it. However, source control systems have been kind of unforgiving in the past. Once something was committed, it was done, right? There was no way to go back and change it. And, you know, everyone makes mistakes, especially programmers.
04:41
So we need a way to go back and take care of our history regardless of how we prefer to work, regardless of our own workflow. Because as it turns out, history is not the same thing as your workflow. You should be able to work the way you like
05:02
and still take care of making sure that your history is clean and readable and makes sense to everyone else on the team and your future team members and even your future self. Because while history is public, your workflow is private. And I'm sorry for that icon,
05:20
I couldn't find a better way to represent private. If you guys come up with a better way, let me know afterwards. So that's something that Git does differently. As Linus Torvalds eloquently put it, don't expose your crap.
05:41
So Git allows you to work the way you like and still present it to the world in a way that's clear. The second thing that Git does differently is the way it handles branches. Now we all know from the past, we all have the scars of branching in other source control systems
06:02
like CVS, Subversion, TFS and those like that. Because every time you created a branch, you know that you created an entire copy of all your files down to the last one. So if you had a large code base, a large project,
06:23
you know, every branch was an entire copy of it. You could potentially run out of space. And also the merging and branching operations were slow, because creating a new branch, you know, meant copying all the files. So we all know that the rule of branching is
06:41
never branch until you absolutely have to. Right? That's the rule we live by. However, in Git, branches are cheap. It's a cheap operation. And if something is cheap, we can use it in a very useful way.
07:03
So, remember the rule about never branching until you absolutely have to? The new rule in Git is branch like there is no tomorrow. Branch every time you like. Use branches in a way that makes sense to your own workflow.
07:21
So use them to your own advantage because they're a cheap operation. So history and branching are two of the things that Git does differently than all other source control systems. All right, but there is a problem. It's not all sunshine and rainbows.
07:40
I don't know if you noticed, but Git is kind of hard. How many of you agree that Git is hard? Okay, this time I really nailed it, because I wrote "almost everyone". Sometimes all the hands go up and sometimes none of the hands go up. So nobody thinks it's hard.
08:01
Well, I think it's hard, yeah, but not in the way everyone thinks. And I'm going to explain that. I don't know if you noticed, but people like to complain on social media. And when something is hard, people go on and write stuff like this.
08:26
Git to me is harder than programming. Now, this is kind of a bold statement, but I kind of see where it's coming from. And in computer science, what do you do when something is hard? Any guesses?
08:41
In computers, if something is hard, what do you do? Yeah, that's right. You kind of stick a GUI on top of it to make it easy to use. All right, and look at those GUIs. I don't know if any of you are using one of those GUIs, and if so, I'm sorry, but look at that menu
09:05
and look at this right-click menu. Now, tell me how this is less hard or easier to use than using it, for example, from the command line where it's meant to be used. So these GUIs don't actually solve the problem.
09:23
Maybe they solve the problem in other situations, but for Git, they don't solve the problem. What they do is expose a subset of what Git can do and just present it to you without guidance. So if you don't know what to press, there is actually no help.
09:40
So what we want to do instead, we want to understand how Git works internally from the ground up and build this mental model of how Git works, and then with that knowledge, we can use Git from the command line where we have all the freedom we need. There is no GUI that, you know, presents a subset of what's possible
10:01
and having this mental model as a guide. And this is what this talk is about. So no more sadness from the GUIs. Instead, we want to understand Git. So if you think that Git is hard, what if I told you that it's actually not hard?
10:22
It's simple. Let me show you what I mean by that. Now, let's imagine a world where Git doesn't exist. And all we have are files and directories. Now, this is your usual file explorer.
10:44
And let's pretend we all work on a project together. Do you have an idea for a name? Let's make a library. Why not? Any ideas of names for this library that we're going to work on together? New library one. New library one. That's a good name. That's a good name.
11:01
What about something like the Windows Testing Foundation? And of course, we're going to need an acronym for that. So let's call this the WTF library. Okay, so this is our project.
11:22
And we don't have any source control. We just have, you know, the file system. So we work on this file for a while. And now we want to save a copy of it as it is right now. So what do we do? Any guess? You create a directory. You give it a name that would be like your message of what the state of the file is.
11:43
Let's say just A for convenience. Then you take this file, you copy and you paste it in there. And here you have the timestamp of when this was done. So you have kind of a reference. And then you keep working on this file. And we want to save it again. So let's make a new directory.
12:01
Let's take it and paste it in there. All right. Now we feel like this library is kind of done, ready to be released. So we take a new snapshot of it, a new backup copy or whatever you want to call it. And we paste it in there as well. All right. Now we keep working on it.
12:21
This won't go on for too long, don't worry. All right. So now the last snapshot, D, contains stuff that's after the release because C is the one that's been released. And now the unthinkable happens. Someone writes an issue or reports a bug.
12:43
And now we need to fix it. However, we don't want to include the changes that are here in this copy, which is the one we are still working on. We need to go back and retrieve the copy of the library as it was released. Right? So what do we do? Luckily for us, we have it over here.
13:01
So we can copy it and paste it in the one we are working on and replace it. And now we have the file as it was when it was released. We fix the bug and we take a new copy. All right. However, now we want to resume work.
13:22
But what file should we take? Now imagine that we have a lot more of these. How do we know which snapshot we were working on before? And how do we know how we got there? Just copying it in directories like this doesn't really help.
13:41
So what is the least we can do now to create some sort of history of the order in which things happened? Well, the least we can do is record the snapshot that came before it, as the only piece of information that we add.
14:02
So in this case, A would be the first snapshot, so we don't write anything. And in B, we say that A came before B. And we make it part of the message. And then we say that B came before C. Right? At this point, we know that D also came after C.
14:25
And E, which was the bug fix, also came after C. So now, I think you guys can see that we have a kind of a history going on, right? We have a line here.
14:42
And this line suddenly diverges here, because we have two snapshots that both have C as their previous one. We are kind of seeing how history is diverging now, just by recording what came before each snapshot. Okay? You see where this is going.
15:03
However, we have another problem. Now that we have multiple lines of history, we need a way to refer to them. Because it would be easier for us to know. What is the simplest thing we can do now to give a name to those lines?
15:24
Any guesses? All right, so let's create a text file. Let's call these branches. All right? And in this file, we are going to record the last one in a certain line, because there is always going to be a last one, right?
15:42
So let's associate a name with the last snapshot in a line of history. So let's say, for example, that E is the vNext line of history, and D is our main line of history.
16:01
What do you want to call it? Yeah, let's call it master. Could have been something else. Now we have one last problem to solve. Now we know we have lines of history, we have names associated with them, so we know what they mean, we know what the latest snapshot in a line of history is,
16:23
and we know how to go back from there and reconstruct the chain of parents. But this file here, the one we are working on, our working copy, how do we know which snapshot it's coming from? We don't know. It could be coming from either one of those.
16:44
Here we will always be working on the latest snapshot of a line of history. So what if we create a new text file, and let's call it HEAD, and in this file we are just going to record which line of history we are currently working on. So this file is coming from the snapshot in that line of history.
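The folder scheme built up so far can be sketched in a few lines of Python. This is only a toy model of the talk's thought experiment, not Git's actual storage: the snapshot names A to E and the parent links come from the walkthrough, while the dictionary layout, the lowercase branch name "vnext", and the history function are assumptions made for illustration.

```python
# Toy model of the "folders plus two text files" scheme from the talk.
# Each snapshot records only one extra fact: which snapshot came before it.
snapshots = {
    "A": {"parent": None},  # first snapshot, nothing came before it
    "B": {"parent": "A"},
    "C": {"parent": "B"},   # the released version
    "D": {"parent": "C"},   # work that continued after the release
    "E": {"parent": "C"},   # the bug fix made on top of the release
}

# The "branches" text file: a name for the last snapshot in each line of history.
branches = {"master": "D", "vnext": "E"}

# The "HEAD" text file: the line of history the working copy belongs to.
head = "master"


def history(branch):
    """Walk the parent links from the tip of a branch back to the first snapshot."""
    chain = []
    current = branches[branch]
    while current is not None:
        chain.append(current)
        current = snapshots[current]["parent"]
    return chain


print(history(head))     # ['D', 'C', 'B', 'A']
print(history("vnext"))  # ['E', 'C', 'B', 'A']
```

Both lines of history share C, B, and A, which is exactly the point where the history diverged.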
17:04
So, for example, let's say master. Now, what I've done here is exactly what Git does every time you run commands. So, as you see, Git is nothing more than a file system built on top of your own file system,
17:26
which is not so strange considering that Linus Torvalds is kind of a file system guy, and is very interested in high performance. So Git actually was born as a bunch of C programs that manage this directory structure and these files for you.
17:44
And the source control part has been built on top of it afterwards. Now, I see you guys don't believe me. You say this is impossible. Alright, so I'm going to prove that this is actually the case. So let's forget about this.
18:01
Now, what I have here is, I have a split screen where on the top I have a console, which in this case is PowerShell. And how many of you are using PowerShell, by the way, with Git? Few of you? Alright. So I actually prefer PowerShell to Git Bash on Windows,
18:23
because I can use this plugin called posh-git. It's actually a module. It's called posh-git over here, and it's on GitHub. And if you install that, you just copy the files basically where the PowerShell modules are,
18:40
it will give you this nice prompt over here, which is very useful when you work with Git. For example, it shows the current branch name, and every time we change files it shows how many files are modified in our working directory and how many of those are part of the index, or staging area, as you're going to see. So I was going to prove to you that what I showed you before is actually what is done internally.
19:04
So there are some Git commands that you normally wouldn't use. Those are called plumbing commands, and those are the lower level infrastructure of commands that manage that structure for us. So you normally wouldn't use them, but in this case I'm going to use them to show you what this structure looks like.
19:23
So one of them is called show-ref. git show-ref shows you all the named references, the ones we had in our branches file, in the output. And you see that we have one called master, which is in this directory. Now master is only a file that lives in the .git directory,
19:44
which is a directory that is in every repository. And I can show you that: if I look at .git/refs/heads/master, you see that's a file that contains only that long string,
20:02
and that string is the SHA-1 hash of the commit. Because in git, the way git references every object is by making a hash of its contents. So we have the master reference.
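To make "a hash of its contents" concrete, here is a minimal sketch of how an object ID can be computed for a file's contents (a blob) in a SHA-1 repository: Git hashes a small header (the object type, a space, the size in bytes, and a NUL byte) followed by the content. The function name git_object_id is made up for this example.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash '<kind> <size>\\0' followed by the body, the way Git names an object."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# The ID depends only on the content, never on the file name or its location.
print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_object_id("blob", b"hello\n") == git_object_id("blob", b"hello\n"))  # True
```

The first value is the well-known ID of an empty file, so two empty files anywhere in any repository are stored as the same object.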
20:23
Now let's take a bit of this ID, and let's look at its contents by using another command called git cat-file. Now what we are seeing here is the metadata associated with a commit.
20:44
Alright? And we see that there is some information associated to it. There is the author, the committer, which could be different in some projects. Then, as I showed you before, there is the parent, and it's another ID,
21:05
so a commit knows what the parent is by containing the ID in its metadata, and then it has a reference to something called a tree. Now, what's a tree? A tree is git's internal representation of a directory.
21:23
So we had directories before, and git represents them internally with an object called a tree, which basically maps to a path. So let's grab a little bit of this ID, and let's use another command called git ls-tree,
21:41
and we will see that this tree is basically the root directory in our repository, and it contains our files, this one you recognize, and these files are called blobs. Blob is another git representation for a file, so trees are directories and blobs are files in git's internal structure.
22:05
Now let's take a look at this file. Let's grab a bit of this ID and then use cat-file again, and bam, you have the entire content of the file. It's not a diff from the previous one, it's the entire file.
22:24
So as I told you before, every time you make a commit, git takes a snapshot of whatever you have, all the files, and saves them as is, as a snapshot, and this is the proof. The commit references a tree, which is a directory, which has blobs that are files, and each file is the entire file.
22:46
So when you use the diff command, and look at the commit referenced by head, which is the one we are looking at right here, let's do the show command by the way, you can see that this diff is calculated on the fly,
23:05
comparing the snapshot you have now with the previous one. Of course, it would be kind of wasteful, a pity for your hard disk, if every commit were an entire copy of everything. So git is smart about storage and can actually save space:
23:23
it can calculate the diffs among a series of commits, which is called a delta chain, and save only what's different. But that's just an implementation detail for the storage subsystem. Conceptually, you can still think of it as a snapshot for every commit,
23:40
and this really helps. So to summarize, what is git? Git is a bunch of snapshots of your files and directories, and on top of it you add a DSL. So if you think of git as a domain specific language to track the contents of your files and directories,
24:04
then it makes much more sense why you should use it from the command line, because it will be just like another programming language. So it will be like having a time machine, however one that can only go back in time, cannot go forward, and on top of it you smack a command line DSL,
24:25
and there you have git. So let's redefine this git is hard. Git isn't hard at its conceptual level. Git syntax is hard.
24:40
And actually, and I'm sorry to say it, there is no good reason why it should be that way. The only reason it's like that is because programmers are programmers. They have lots of opinions, different ways of doing things. You know the saying that if you have too many chefs in the kitchen,
25:01
they spoil the soup? Now imagine if your kitchen was full of programmers. They wouldn't be able to make a sandwich. So git syntax is hard, however the concept is simple. And let's summarize it a little bit.
25:20
You have the commit. That commit is actually only the metadata about what you committed that contains the author, the timestamp, and the committer, and the message of course. All that information is hashed and it produces a string,
25:43
which is the commit ID. Now a commit has a reference to a tree, which is a directory, the root of your repository. A tree can contain other trees, or it can contain the actual files, which are called blobs. The commit also has a reference to its parent,
26:02
which also has a tree and the blobs. And in order to reference lines of history, we have branches. And branches are just files pointing to a commit ID. So that's why they are cheap. You're just creating a 40 byte file.
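The chain just summarized (commit, tree, blob, parent, branch) can be sketched as follows. This is a simplified illustration for a SHA-1 repository, not a reimplementation of Git: the file name Program.cs, the identity string, and the helper names are hypothetical, and the tree helper only handles plain files in a single directory.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Name an object by hashing '<kind> <size>\\0' plus its body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_id) entries: '<mode> <name>\\0' plus the raw 20-byte ID.

    Sorting by name is enough for this sketch, which has no subdirectories.
    """
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return out


def commit_body(tree_id, parent_id, who, message):
    """Serialize commit metadata: tree, optional parent, author, committer, message."""
    lines = [f"tree {tree_id}"]
    if parent_id:
        lines.append(f"parent {parent_id}")
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()


# Hypothetical identity and timestamp, used only for this example.
who = "A Hacker <hacker@example.com> 0 +0000"

blob_id = object_id("blob", b"class Program {}\n")                           # a file
tree_id = object_id("tree", tree_body([("100644", "Program.cs", blob_id)]))  # a directory
first = object_id("commit", commit_body(tree_id, None, who, "Initial commit"))
second = object_id("commit", commit_body(tree_id, first, who, "Second commit"))

# A branch is just a tiny file holding the ID of its latest commit.
branch_file = second + "\n"
print(len(branch_file))  # 41
```

The branch file comes out at 41 bytes rather than 40 because the 40-character ID is followed by a newline, which is still small enough to explain why creating a branch is so cheap.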
26:23
When you say git branch, you're creating a 40 byte file that contains a commit ID, as you have seen. So now that we have this mental model, if you keep this mental model when you're working with git, all the commands, well not all the commands unfortunately, but most of the commands will suddenly make sense.
26:42
Because you will see how they manipulate this model. Alright, so now that we have this information, let's do something useful. Now, I guess that when you use git from the command line,
27:02
you use git log a lot, right? You want to see the history of your commits. So you say this, for example. Now how many of you use git log like this, vanilla? Alright. A few of you. Now, you might not know that since programmers are programmers,
27:25
all of git commands take a wide range of options. Not all of them are useful. Some of them are deprecated. Log has one called pretty. And this one can take any number of strings.
27:40
What if we say that we want to format the log just one line at a time? And this will give us a more succinct output, more useful. However, it doesn't contain enough information for us: it only contains the commit ID and the message.
28:01
However, we can do better than this. So, let's use pretty again. And this time, let's change the color of our standard output to red. Then let's print out using a placeholder, and those are documented.
28:20
There is a list on the git documentation of all the placeholders you can use when you format commits. And every placeholder represents a piece of information. For example, the commit ID, the commit message, the timestamp, those have different placeholders. So, h is the hash. So, let's print that out.
28:41
Then let's reset the color. Then let's put a pipe symbol, because why not? And then let's change the color again to green. And now let's print out d, which, can you guess what d stands for? You will never guess.
29:00
It's the references. Why not call it r? I don't know, programmers. So, these are the references, the named references pointing to the commit. They would be like branches or tags or head. So, let's reset the color again. Now, let's print s.
29:23
Now, this actually makes sense. In git, there is a convention about commit messages. They should consist of two parts. A one-sentence summary that describes the commit in at most 50 characters, and that's the summary, that's why it's called s. And then you have the body, which is a longer paragraph,
29:42
which contains details about the commit, if you want to explain more. Now, you might have seen this kind of style in some GitHub repositories. So, s is the summary. Now, let's change the color again to yellow. And now, let's open up parentheses, and I want to print the timestamp, c,
30:03
but I want it relative to now. I don't want the absolute timestamp, I want it to calculate from now how long ago the commit was made. So, relative. And let's reset the color again. Is this better? This is actually a little bit better.
30:22
Now, of course, all you have to do is just type that every time you want to do git log. Now, of course, you won't do that. Now, there is another feature that you might or might not know, called aliases. Now, in git, you can associate any combination of commands and options, even shell functions, with a name
30:42
that you can use just like any other command. And we can associate this entire string with an alias, which we create by using git config, which changes the configuration options. There is a section called alias, then we call it lg, and then we take this entire string
31:02
and associate it with that, so then we can write git lg. Of course, you can't use log as the name because it's already taken. All right, so that's interesting. That's a better way to look at history. Now, let's talk about commits.
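Before moving on, the log format dictated over the last few minutes can be pieced together like this. It is a reconstruction under assumptions: the placeholders (%h, %d, %s, %cr) and the colors follow what was said, but the exact spacing between the parts and the way the result is written into a config file are guesses. The sketch only assembles the strings; it does not run Git.

```python
# Pieces of the format, in the order they were described in the talk.
parts = [
    "%C(red)%h%C(reset)",        # abbreviated commit hash, in red
    "|",                         # a pipe symbol, "because why not"
    "%C(green)%d%C(reset)",      # ref names (branches, tags, HEAD), in green
    "%s",                        # summary line of the commit message
    "%C(yellow)(%cr)%C(reset)",  # committer date relative to now, in yellow
]
fmt = " ".join(parts)  # joining with single spaces is an assumption

# The one-off command, quoted so the shell passes the format as one argument.
log_command = f"git log --pretty=format:'{fmt}'"

# The same thing as an entry in the [alias] section of a Git config file,
# after which "git lg" expands to the long command.
alias_config = f"[alias]\n\tlg = log --pretty=format:'{fmt}'"

print(log_command)
print(alias_config)
```

The quotes around the format matter in both places, because the format contains spaces and would otherwise be split into several arguments.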
31:20
I don't know about you, but I like my commits small and focused. Why? Because it's easier to reason about a patch that makes one logical change, and it's contained in size, right? We don't want to see, well, I'm sure you guys have also seen it. I still remember it. There was a project a long time ago where one of the contributors to the project made a commit,
31:43
and this wasn't even git, it was TFS. But that's beside the point. This commit contains 724 files. And can you guess what the commit message, well, the check-in message was?
32:00
Good guesses, but it was many changes. With a dot. Many changes. That's not useful for anyone. So we want commits to be small and focused, and every commit should make one logical change. However, in other source control systems, this rule is kind of hard to follow
32:22
because you need to adapt your way of working in order to make sure that you only make one change at a time in your working directory, right? You can't make a bunch of changes because then it's hard to separate them into different commits. But with git, you can do that because your workflow is separated from your history. So let's open up our file, which is just this,
32:44
and let's make two changes. Let's remove the static word. Well, let me show you. Remove the static word, and then let's write a comment here, like, do something useful with the args.
33:02
Now, we have made two changes to our file. One is a comment, and one is an actual modification of the program. We wouldn't want to make those two part of the same commit because those are actually semantically different. So how do we do that? Now, I'm sure you guys know about the staging area or the index.
33:24
That's something that no other source control system has. And so the staging area is this kind of in-between area that sits between your working directory, that is the files that are modified on disk,
33:41
and your next commit. So you can pick and choose what files are going to be part of the next commit. And this is something that turns out to be really useful, especially in this scenario. However, what you might not know is that you can also add part of a file to the index by using add with the --patch option, or -p.
34:06
Now, what Git does is that, and you know what I told you about programmers? Look at this. What is that? So let's first talk about this word.
34:21
Stage this hunk. Now, this, I found out, has nothing to do with good-looking men. However, every time I see that, I can't get the picture out of my head. So a hunk is actually a term from the unified diff format, which is a specification. It is part of the POSIX standard.
34:42
And what it means is that it's a portion of a patch. So it means, do you want to stage this portion of a patch? And all these options that come afterwards, well, I promise you that what we are interested in here, well, y means yes and n means no.
35:01
That makes sense. Then you have q, which is quit. Then you have a, which is all. Let's just stage this entire file and move on to the next one if you have many. D means discard. That means just exit and go on to the next file. This one actually I don't know. It might be a separator.
35:22
S is split and E is edit. Edit just opens it up in your default editor and you can just change what the patch is going to be. What we are interested in here is S because we don't want to stage this hunk. We want to split it into smaller ones. So let's say S. And now it asks me, do you want to stage this part, this line?
35:42
And I say yes. Do you want this one? No. And as you can see from this prompt, now we have a split situation as I like to call it. The same file is simultaneously staged and unstaged. Now how do you look at what's going to be part of the next commit?
36:04
You say git diff dash dash staged. And it shows you just what's going to be part of the next commit. If you want to see what's left in the working directory, you just say git diff. All right? Now at this point, we can just commit what we have
36:24
and let's call it D. And let's see that tool over there. I forgot to mention in the beginning that this tool is up on GitHub. It's called SeeGit. See as in S-E-E. However, it's very buggy.
36:41
It's held together with duct tape. So during this presentation, I'm going to have to exit it and re-enter it just to make sure that it refreshes correctly. And if you are really unlucky, I might have to reboot my computer. So be warned, it might happen. So now let's add the rest of the file.
37:01
Just add everything. And let's make the commit. I don't understand why when I want stuff horizontally, it keeps putting them over there. So now we have two commits. Now, let's do something more interesting. How much time do I have?
37:21
Until 40, right? Right. Okay, now let's see. Workflow is key here. We made two commits, but we are on the master branch, which is the main line of development. What if we suddenly realize after the fact that those two commits were actually part of an experiment?
37:43
So we actually wanted them in a different branch. Now, if we were using Subversion, no, tough luck, sorry. Those commits are already pushed and everyone has them. But in Git, we can actually change our mind. So let's create a branch and let's call it experiment.
38:02
And let's create it right here on the commit we are on, that is head, but you don't have to say it because it's implicit. And watch what happens now. I created a 40 bytes file containing the ID of the commit D.
38:20
Now, what I want to do is take master and move it back so that master points to whatever it did before I started making those commits, which in this case is commit C. Now, how do I reference commit C? Remember, the only information I have is given a commit, I know what the parent is.
38:41
So either I reference it directly using the commit ID, that is the hash, or I can say from a named reference, in this case, head or master, I can say a certain number of commits backwards. So I can say git reset, which is the command to move a branch.
39:02
And I really mean it, so I'm going to say hard. No joke. I really want to do it. Now, hard actually means that I want to move the reference and I also want to modify the index and my working copy so that they match the destination commit.
39:21
So reset hard and I'm going to say from master, how many commits backwards? One, two. So I say tilde two. The tilde is the syntax to say from one commit, go a certain number backwards. You can only go backwards. So let's do this and bam, master has moved.
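The branch-then-reset move just described might look like this in a scratch repository. The commits A through E and the one-file-per-commit helper are stand-ins for the demo history, not the actual demo code.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one small file per commit, named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
for c in A B C D E; do commit "$c"; done

git branch experiment          # new reference at HEAD; nothing else changes
git rev-parse experiment       # the 40-character commit ID the branch stores

# Move master two commits back; --hard also resets the index and working copy.
git reset -q --hard master~2

git log --format=%s master       # C, B, A
git log --format=%s experiment   # E, D, C, B, A
```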
39:43
So now I actually have two different branches. So now let's open up our file and let's make another commit. You see static is back here and there is no comment. So what do we want to do? Oh yeah, I don't think this is used, right?
40:01
Let's do something useful. Let's remove an unused statement. Now I'm just going to, by the way, speaking of aliases, I have an alias called commit all, which does commit --all -m and I can say message.
40:21
So when I commit everything in my working directory, that is stage everything and commit everything, that's modified, I can say commit all and the message. So let's call it f. I use a lot of aliases. So let's watch me rearrange this
40:41
because this tool, I don't know, it's a WPF application, does it matter? It certainly doesn't make things easier. So now we have again a split situation like we had in the first demo. We have C and then we have two lines of history going from C. One is master, one is experiment.
41:00
Now something that's really useful when you work with branches is to know what commits are in what branch. Now you might notice that those are gray and these are not. What this tool shows you is a visual representation of the concept of reachability. I don't know if it's a real word, but in Git it is.
41:23
A commit is said to be reachable from a reference if there is a chain of parents that leads from the reference to that commit. So in this case, head, which is the working directory, the commit with the contents in the working directory,
41:43
there is no path from head to these two. You see, because it goes to C, B and A. However, if I use the checkout command and do a checkout experiment, which moves head, you see that these two now are reachable and F is not reachable anymore.
42:02
All right, so now how do we know? I have an alias called co. How do we know which commits are in experiment but not in master? Any guesses? Remember that this is made by programmers. What I want to use is that I want to use my alias, lg, and then I need to say, Git, show me all the commits
42:24
that are not reachable from the first reference, that is master, dot dot, but that are reachable from experiment. And I get D and E. Now you can make an alias for it and I have it called new, Git new.
42:42
Git new shows me the commits that are new in my current branch that are not in master. Git new. So if I change the order of those and I say experiment dot dot master, that is show me the commits that are reachable from master but not from experiment, I get F.
43:02
This is called the two dot notation or dot dot notation. However, if there is a two dot notation, how many of you would guess that there is a three dot notation? A few. So I guess the rest of you don't think there is a three dot notation. Well, you would be surprised.
43:22
Let's try the three dot notation or dot dot dot, depending on how you like to pronounce it. It shows me those that are reachable from one or the other but not both. And this is useful in some commands.
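As a rough illustration of the two range notations, the sketch below rebuilds the same shape of history (A, B, C shared, D and E on experiment, F on master) in a scratch repository. Plain git log with a subject-only format stands in for the lg alias, and the commit helper is hypothetical.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
for c in A B C; do commit "$c"; done
git checkout -q -b experiment
commit D
commit E
git checkout -q master
commit F

# Two dots: reachable from the right-hand side but not from the left.
git log --format=%s master..experiment    # E and D
git log --format=%s experiment..master    # F

# Three dots: reachable from either side, but not from both.
git log --format=%s master...experiment   # D, E and F (order depends on dates)
```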
43:40
But if there is a three dot notation, do you think there is a four dot notation? Remember this is made by programmers. How many of you think there is a four dot notation? More now. I'm sorry to say that there isn't. Yet. Yet. So remember Git is still being developed.
44:01
So this is the two and three dot notation. Alright, now let's make something, let's make a merge. Let's merge the experiment branch into master. So use the merge command, right? Now you see that since these two lines of history have diverged, when I say Git merge, it's going to create a new commit,
44:22
this time with two parents. Let's say that we want to associate it with the M message. So let's, well, now, alright. Let me just get out of this. I told you that I was going to do that. Can I get in again?
44:41
And hopefully this will look right, yes. Alright, so now we have a merge commit. Now a merge commit is a special kind of commit that has more than one parent. It can actually have more than two. It can have X number of parents because you can merge any number of branches by using different algorithms.
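A minimal sketch of the merge above, again in a scratch repository with made-up commits. Each commit touches its own file so the merge cannot conflict, which is a simplification of the demo.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
for c in A B C; do commit "$c"; done
git checkout -q -b experiment
commit D
commit E
git checkout -q master
commit F

# The histories have diverged, so this creates a new commit with message M.
git merge -q -m "M" experiment

# %P prints the parent IDs: two of them for this merge commit.
git log -1 --format=%P
git log -1 --format=%s HEAD^1   # first parent: F, where master was
git log -1 --format=%s HEAD^2   # second parent: E, the tip of experiment
```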
45:01
Now let me tell you this. If you have a large number of branches in a project and you want to merge all of them, to minimize the number of conflicts, there is an algorithm you can use and tell Git to use, which is called Octopus. I rarely, I never use it actually.
45:21
But it's good to know that there is. So we have this merge commit. However, remember this only ever exists in my local repository. No one has seen that experiment branch. So if I now were to push to a remote, now everyone would see,
45:41
they wouldn't see the experiment branch because that reference only exists on my computer. However, they would see the merge commit and they would wonder why on earth anyone has branched out and then merged back in. Now there is this discussion going on
46:00
in the Git community about whether you should keep merge commits or not because actually there is another way of merging that doesn't involve a merge commit, as I'm going to show you. Let's first look at that and then let's talk about this dilemma of what is real history and what is not.
46:23
In this case, we have this so-called topic branch, a branch that we created for our own sake. We don't want to share it with everyone. I am of the opinion that nobody actually cares about that branch. They just want the commits. So they are not interested in dealing with merge commits.
46:41
So let's remove that by using the reset command, hard, and this time I'm going to say just HEAD tilde one. This has the effect of moving the master reference and now you see that the merge commit is gone because in Git there is this rule that if a commit is not reachable by a reference,
47:05
it's unreachable and it's going to be eventually deleted. Now, it's still in my repository and it will be for a long time. Actually, it will be for 30 days before it's garbage collected. So that's why it disappears from the graph
47:21
but it's still there. What we want to do now is that we want to bring D and E onto the master branch. However, we don't want to lose F. So what do we want to do? We want to take D and E and change the parent of D from C to F
47:42
because wouldn't it be nice if we can just do this? Move D and move it so that it's on top of F, like this. Wouldn't it be nice? So it appears as if D and E came after F and they will give us a nice line of history.
48:02
One line. Because remember, that branch nobody cares about. I just created it for myself. So we want to change the parent of a commit from one to another. Or, in other words, we want to change the base of a commit from one to another.
48:21
Or, in other words, we want to rebase. Rebase is just changing the parent of a commit from one to another. And when you change the parent, of course, you are going to change also all the commits that come after it. So in this case, I want to say,
48:43
let me first change to check out. Let me first change to experiment. And you see that now F is not reachable. And now let me say git rebase. Where do you want to rebase it? What is your new parent? F.
49:01
So I can use the id of F. Or I can just say master. Now, don't watch the prompt. Watch the graph. If this trick works, I don't know, let's see. Remember, don't watch the prompt. Watch here what happens. Now git is going to remove D and E, then it's going to move to F, and then it's going to apply them again and BAM!
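The rebase shown on screen could be reproduced along these lines. The repository and commits are stand-ins; the point of the sketch is that D and E end up on top of F, and that they are new commits with new IDs.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
for c in A B C; do commit "$c"; done
git checkout -q -b experiment
commit D
commit E
git checkout -q master
commit F

git checkout -q experiment
old_d="$(git rev-parse HEAD~1)"   # D while its parent is still C

# Replay the commits that are in experiment but not in master on top of master.
git rebase -q master

new_d="$(git rev-parse HEAD~1)"   # D again, now with F as its parent
git log --format=%s               # E, D, F, C, B, A: one line of history
```

The changed ID of D is why rebasing is best kept to commits that have not been shared yet.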
49:24
You did watch the graph. Right? So rebase did exactly what we wanted. It moved the parent of D from C to F. Now what's even more interesting, let me just clear this, is that if we move back to master,
49:43
if I do a merge now, git is going to see that, uh-huh, you want to merge experiment. But experiment is already a descendant of master. So I don't have to create a merge commit with multiple parents. Instead, what I can do,
50:02
is that I can simply, maybe some of you already know it, move forward the master reference so that it points to the same commit as experiment. And this is called a fast-forward commit. I don't know why it's called fast, but it certainly is fast. But it's definitely forward.
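A sketch of the fast-forward case follows. It assumes experiment has already been rebased, which is simulated here by building D and E directly on top of F. The --ff-only flag is not something the demo uses; it is added so the command refuses to run if a real merge commit would be needed.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
for c in A B C F; do commit "$c"; done
git checkout -q -b experiment   # experiment starts at F, as it would after the rebase
commit D
commit E
git checkout -q master

# experiment is a descendant of master, so master simply moves forward.
git merge -q --ff-only experiment

git rev-parse master experiment   # both names now point at the same commit
git log -1 --format=%P            # a single parent: no merge commit was created
```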
50:22
And fast-forward commits, sorry, fast-forward merges, are the other way of merging that git has that no other source control system has. As opposed to true merges. True merges are where you actually see the merge commit. Now there is this discussion going on,
50:41
and what is best? What do you want? Some people say, you always want the visible merge commits. For two reasons. Why? One is because that's what actually happened. And they have this belief that history should be
51:02
you should record what actually happened. You shouldn't be able to tamper with history. So if a merge commit happened, even if it only exists on your computer, that's what should be in the final history that everyone sees. That is one school of thought. The other school of thought says, we want history that's clear.
51:22
We don't care about what actually happened. Because if we keep history as is, then we're not taking advantage of all the features of rewriting history and making it clear and easy to follow. So if a branch only exists on my computer and nobody will ever see it,
51:40
there is no point in having a merge commit. It just clutters history and creates something that nowadays I've seen being called Guitar Hero history. It's where you have a lot of branches going out and then they merge back in. It looks like the Guitar Hero stuff. So there are these two schools of thought.
52:01
Now what you pick is up to you and your team, of course. I certainly prefer having a clear history. That makes sense. And if I can avoid merge commits, I do, because I want a linear history. So my rule of thumb is this. If you're merging a public branch into another public branch,
52:21
that is a branch that is being shared by multiple people, in that case you want a merge commit because you want to see where these two branches that are public have merged. If you're merging a private branch, one that only exists in your repository, with a public one, then nobody cares about that merge commit. In that case, you just want to keep rebasing
52:42
your private branch on top of the public one as it moves. And when you're finally ready, you do a fast forward merge. That is my rule of thumb. Now for the final rush before lunch, let's do something a little bit more interesting.
53:02
Now, you know what? I have a build script, as all projects do. And this build script, in this case, just compiles the file. All right? So let's compile this. It doesn't compile. It probably never did.
53:21
I don't know because I'm running it for the first time now, 50 minutes into the presentation. So I have this problem that I don't know where it broke. I have no idea which commit broke my code. Do you guys know which commit broke my code? If anyone knows, let me know.
53:40
So in situations like this, what is the first thing we want to know? Of course, who did it? And if it turns out that I did it, then we want to rewrite history. So that it never happened. However, how do we know? It turns out that Git has this nifty feature,
54:02
where the name actually makes sense, called git bisect. Anyone heard of git bisect before? A few of you? All right. So this makes sense and I'll tell you why. What we do with bisect is that we give Git a range of commits where we suspect the problem is,
54:23
sometime in between these two commits. Then what Git is going to do is that, okay, let's find the middle commit in this range and let's move head to that one so that the files in your working directory match that commit.
54:40
At this point, you can run your build script, your tests and see, at this commit, how are things here? Are they good or bad? If they're bad, this means the problematic commit must be earlier than that. Right? So let's go to the first half of the range
55:00
and, in this case it will be like C, let's find the middle commit again. It will be B. Let's move head to that one. Now tell me, are things good or bad? And you keep going, if things are good that means that it must have happened after, so you move to the upper half. So what you're basically doing is
55:23
well, dissecting your code. So you're doing a bisect. Smart name. Let's try it out. So you say git bisect start and now you have to give it the two commits. The first commit is where things are bad and we know that things are pretty bad right now
55:41
so let's say that head is where things are bad. Now you have to say where were things known to be good the last time you checked and I have no idea so let's say that A is where things are bad. Sorry, good. So from master is like one, two, three, four, five.
56:04
And then what git does is that you see it places head on the middle commit and now it asks us to run our build script. Now if you have a long range you wouldn't want to keep doing this manually because it could take a while. But if you have a build script
56:22
that runs all your tests and does everything you can just tell git to run that at every step and decide based on the exit code from that build script if the commit is good or bad. So if the build script exits with code zero then git is going to think that that commit is good. If it exits with anything other than zero then it's bad.
56:44
So luckily for us we have it so let's say git bisect run and we have msbuild so git is going to keep going and at each step run this command. So at this point I can go for a cup of coffee and git is going to run this and it's going to say at some point that
57:02
I think I found the first bad commit. It's B. And head is of course there. So let's verify by looking at the patch and indeed
57:20
this is something nobody spotted before. This has been there the whole time. This is the commit that introduced the line that created the problem. Now as I suspected I am the author of that error.
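The automated bisect session can be sketched as follows. Instead of msbuild, a one-line grep acts as a stand-in build script: it exits with zero unless the file contains the word BUG. The file app.txt, the five commits and that check are all invented for the example.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Each commit appends one line; commit B is the one that introduces the "bug".
commit() { echo "$2" >> app.txt; git add app.txt; git commit -q -m "$1"; }
commit A "ok 1"
commit B "BUG"
commit C "ok 2"
commit D "ok 3"
commit E "ok 4"

git bisect start > /dev/null
git bisect bad HEAD > /dev/null      # things are broken right now
git bisect good HEAD~4 > /dev/null   # commit A is assumed to be good

# Exit code 0 marks a commit as good, 1 marks it as bad.
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null

# While the session is open, refs/bisect/bad points at the first bad commit.
first_bad="$(git log -1 --format=%s refs/bisect/bad)"
echo "first bad commit: $first_bad"

git bisect reset > /dev/null 2>&1    # go back to where things were before
```

One detail worth knowing with a real build script: git bisect run treats exit code 125 as "this commit cannot be tested, skip it".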
57:40
So let's pretend that this never happened because I haven't pushed it anywhere. So let's first do bisect reset and let's go back to where things were before. Now what is the easiest way to fix this? Now this is totally don't mind this anymore because let's just
58:01
let's have it like this. What is the easiest thing to do? Remember commit B is the one that we need to fix but we are on commit E so it's much earlier. Well the easiest thing to do is to make a new commit right now and go to this line and just fix the error
58:22
right? Then we run the build script it's called build.target and verify that everything works now. So now we've fixed the problem. Now what we want to do is that we want to take this patch and make it part of the original commit B
58:42
because that will make it look like this never happened. And how do we do that? Well first we make this commit well let's add it to the index and how many of you know about interactive rebase? Okay a few of you. Now when you rebase
59:00
and you know you change the base all your base are belong to us and so when you do that there is another option that allows you to basically rewrite the history the way you want and the way it works is that you say rebase and then you say dash i or interactive
59:22
then git opens up the list of commits in that range and then it asks you what you want to do. You can change the order, you can delete them and you can squash two commits together and make them one. Now what we want to do
59:41
is that we want to take this commit that we just made, well we haven't made it yet, and make it part of B and there is a way to make it real fast. It's kind of an advanced trick but I think it's worth showing. Let's make the commit message start with the string fixup, exclamation mark.
01:00:01
And then let's tell Git what commit we want to squash this into. That is B. So this is the commit message. So I'll show you the history. It is looking like this. The commit that fixes things is called fixup! followed by the beginning of the commit message of the commit that I want it to be squashed into.
01:00:26
Which is B. Now at this point I can just start an interactive rebase and I want to go all the way back to the beginning of my repository's history. So I say root. And what happens now is that Git automatically takes this commit
01:00:44
and it moves it right under commit B and changes the word from pick to fixup, which means squash it together. See what has happened? It has moved under B and it changed to fixup.
01:01:01
This is our action plan when we want to rewrite history with an interactive rebase. So then we save the file and Git is going to go back, move the commit, squash it together into B and then update history. Takes a while but it's worth it.
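The fixup trick described above could be scripted roughly like this. Two things are assumptions: the --autosquash flag is passed explicitly (the automatic reordering in the demo suggests it, or the rebase.autoSquash setting, was in effect), and GIT_SEQUENCE_EDITOR=true accepts the generated plan as is instead of opening an editor. The files and commits are made up.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: commit $1 writes the text $2 into the file $1.txt.
commit() { echo "$2" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
commit A "ok"
commit B "BUG"   # the commit that contains the mistake
commit C "ok"

# Fix the mistake now, in a commit whose message names its target.
echo "fixed" > B.txt
git commit -q --all -m "fixup! B"

# --autosquash moves the fixup! commit right under B and marks it "fixup".
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash --root

git log --format=%s     # C, B, A: the fixup! commit is gone
git show HEAD~1:B.txt   # the file as recorded in B now contains the fix
```

Writing the message by hand mirrors the demo; git commit --fixup=<commit> generates the same "fixup!" subject for you.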
01:01:24
Don't mind that history. So if I look now you see that the fix up commit is gone and if I take a look at the patch of commit B by grabbing its ID you see that the error is gone.
01:01:43
It never happened. This is a great example of rewriting history to hide mistakes. Alright, I have maybe two seconds left so I'm going to wrap this up real quick with three concrete pieces of advice.
01:02:02
When you are either learning Git or you want to take the next step in the way you use it number one is learn gradually. Now Git is this massive list of commands and options. There is no point in trying to absorb all of it.
01:02:21
Just learn a few commands, the ones that you find useful right now in your current workflow and master them, no pun intended. And when you know them well enough then you can move on to new ones. And I can assure you if you find yourself in a situation where you say I wish I could do this, there are at least three ways to do it in Git.
01:02:45
You just have to Google it. So that's the first one. Number two is find your workflow. And with that I mean do not ask yourself what you can do for Git. Ask what Git can do for you.
01:03:02
Because Git is open to, it doesn't impose any workflow on you. You can use it the way you want. We have seen that some have tried to document different ways of using it. For example the Gitflow way of branching is an attempt to document one way that many people have found useful.
01:03:20
But it's nothing that's required by Git or recommended, just one way. So find your own workflow for you and your team and find out how you can use Git to support that. That's the second one. The third one is keep a clean history. Because with Git you can do it.
01:03:41
And with that I would like to say thank you for your attention. And one more word before I leave you is that of course I had only one hour. So there is much more I would like to show you that you can do with Git. So if you want to know more and you have a Pluralsight subscription I have a course out called Advanced Git Tips and Trips.
01:04:02
Tip, Git, I can't even pronounce it. And don't let the advanced scare you. They made me say advanced. It's not really advanced. It's intermediate. It's basically hard to find a sweet spot. But there may be something that you didn't know about Git in that.
01:04:20
And you can find it at that URL. And with that I'd like to say thank you.