A journey into Git internals with Python

Video in TIB AV-Portal: A journey into Git internals with Python

Formal Metadata

A journey into Git internals with Python
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
A journey into Git internals with Python [EuroPython 2017 - Talk - 2017-07-13 - PythonAnywhere Room] [Rimini, Italy] Despite 12 years of history and wide popularity the workings of Git still remain largely a mystery for many. A lot of Git users operate it just by remembering a bunch of commands and repeating them in a correct order. I was one of them until I decided to dig deeper into how Git actually works and suddenly I understood that internally Git operates by rather simple principles and after you figure them out suddenly all those commands start to make sense. To look into the Git's internal structure you need a programming language to crunch the data and Python fit perfectly for this task. In this talk, I will dig into the internals of Git with Python, that will help you better understand how Git works
OK, a few questions while people are still coming in. Who uses Git at least once a week? The great majority, so the conversion is complete. Who wants to learn about Git internals? Just checking that you didn't come here by mistake, so you know what you're looking for. I find Git internals quite interesting, and I used Python to help me explore them.
Why should we even look into Git internals? Well, "why not" is a good enough answer for me, but I need to convince you that the subject is worth investigating. I think one of the main reasons is a better understanding of the tool: if you know how the internals work, it can lead to more efficient everyday use and to an easier understanding of what you read in the man pages about it. Then you can learn the ideas, because a bunch of quite interesting, innovative and creative ideas were put into the design and implementation; they can broaden our horizons and we can use some of them in our own projects. And of course there is healthy curiosity. I like to watch those shows where they explain how everyday objects work. That's kind of what this talk is: how stuff works, but in the software world.
Please raise your hand if you think that Git is hard, or if at some point in your life you said "damn, Git is hard". Again, the majority.
I think the main reason Git seems hard is that it is largely misunderstood. This is from one of the early README files: when Linus Torvalds wrote the very first version, he described the tool as "the stupid content tracker". The key word here is "stupid", meaning not "not smart" but simple. So it was designed and envisioned to be simple.
Let's look at a very brief history of the first version of Git. Linus Torvalds himself began development of Git on the 3rd of April; the year doesn't matter, this is about days. He had a working version in three days, and one day later he started using it to host the Git source itself. Within the month he had managed to achieve his performance goals. That's incredible. It doesn't sound like a hard tool if one person, maybe with some help but mostly by himself, managed to build it in one month.
So what can actually be hard about it? It should be simple, it was designed to be simple, but then something went wrong. The normal way of using tools, whether software, machines or devices, is that people first learn how to use the thing, master it, and then maybe learn the internals. You don't really need to know how a car engine or transmission works internally to be able to drive well; you don't need to know that much about the internals.
But Git was envisioned and designed the other way around. It's important to note that Git was created by hardcore geeks, not everyday geeks but Linux kernel developers: it was made by Linux kernel developers for Linux kernel developers. Their idea was that if you create simple building blocks and people understand them, the usage will become self-evident. They basically felt that if you first learn the internals, you will automatically understand how to use the tool. My guess is that this is where things went wrong: something that works for Linux kernel developers maybe doesn't work for the wider programming community. That's why I think it's quite useful and important to look into Git internals.
So what is Git? This is a quote from Pro Git, which I believe is the recommended free official book on Git: "Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it." So Git comprises several layers, and they are layers by design. You might be wondering what a content-addressable file system is; I will skip that for a second and come back to it later.
Let's take an early look at the layers. At the very bottom, Git is a key-value store, which is content-addressable. On top of this key-value store a file system is built, on top of that the version control part is built, and on top of the version control part the collaboration tooling is built.
Speaking about commands: the version control level is about commands like git checkout, git commit, git branch, that kind of stuff, and the collaboration level is where you push, pull, fetch and work with remote repositories. This talk will be about the first two levels. They have their own commands and their own structure, but hardly anybody uses them directly every day.
So, the key-value store, the lowest level of Git. With a usual key-value store, whether it's a Python dictionary or some database, the fundamental principle is that you provide a key and store an arbitrary value under that key. The key could be anything; you generate it, and it doesn't need to be part of the content of the value. That's the normal way. I said that Git is a content-addressable file system, or key-value store, so what does that mean? It basically means that the key to the value is the value itself. That doesn't seem to make a lot of sense: content being the key of itself doesn't sound useful and isn't really going to work. So the Git designers came to the idea that you can use a hash of the value as its key.
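The idea of using a hash of the value as its key can be sketched in a few lines of Python. This is a toy in-memory model, not Git's actual storage code: the class name `ContentStore` and its methods are made up for illustration, and it hashes the raw value, whereas real Git hashes a small header plus the content.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the SHA-1 of the value.

    Hypothetical helper for illustration only; it is not part of Git.
    """

    def __init__(self):
        self._objects = {}

    def put(self, value: bytes) -> str:
        key = hashlib.sha1(value).hexdigest()
        # Identical bytes always produce the same key, so a repeated put
        # overwrites the entry with itself instead of adding a copy.
        self._objects[key] = value
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self) -> int:
        return len(self._objects)


store = ContentStore()
first = store.put(b"hello")
second = store.put(b"hello")   # same content, same key
other = store.put(b"hello!")   # different content, different key
print(first == second, first == other, len(store))
```

Because the key is derived from the value, there is no way to change a stored value and keep its key, which is the property the rest of the talk builds on.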
So the key depends on the value, and that's what the content-addressable part means: you can refer to the value by its hash. In Git terminology the key is a hash and the value is an object. "Object" here has nothing to do with object-oriented programming; an object is just any thing that Git stores in its key-value store.
There are two important implications. First, content is immutable. If you have a file, it has a certain hash, and when you change the file content, the hash changes, so the key changes whenever the content changes. If other objects are linking to that original file, then once the content changes, the key changes and all these links get broken. So all objects in Git are immutable in that sense. But there is good news as well: there is no content duplication. If you try to store two identical things in Git's key-value store, there will not be two copies, because they have the same key and there is no need to store the second one; it is the same object. Those are the two important implications.
One more interesting fact: what is a repository in Git, when it comes to the file system? It is just the .git folder in your project folder. All the files that you have beside that .git folder, basically your source code, are not themselves part of the Git repository. They are called the working tree, and it is just a way for Git to interact with normal files. A very important fact is that you can set up a Git repository without a working tree at all; that is called a bare repository, and it is what is used on servers.
Now let's look at what's inside this .git folder. I'm going to run a bunch of commands and then use Python to explore what actually happens inside the Git repository itself. Let's first create our repository. As you can see, it says "Initialized empty Git repository" in the .git directory, and that's what I meant: the repository doesn't include the files.
Let's go there. The storage for the key-value store is located in .git/objects. There is a bunch of stuff in there, but those are directories, not files, because it's an empty repository.
As I said, the lower levels in Git have their own set of commands. There is a command that computes the hash of an object, and an object can be any string. So let's create one, a short EuroPython greeting, and pipe it into git hash-object, which hashes the object from stdin. It just prints the hash.
In the best traditions of Git, if you add a flag to this command it does something completely different. If you add -w, it not only prints the hash but actually stores the object in the key-value store. I think this is one of the problems of Git: a lot of commands have flags that completely change what the command does, and I think that was a mistake.
So it also prints the hash, but now it stores our object. Let's list the store; I will list only files, because directories aren't interesting right now. Now we see there is one more file, and its path is basically the broken-down hash of the object: the first two characters become the directory name and the rest of the hash becomes the file name.
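The mapping between a hash and the location of the stored file, as just described, can be written down as a pair of small helpers. The function names `object_path` and `path_to_hash` are hypothetical, and the sketch assumes loose (unpacked) objects under a `.git/objects` directory.

```python
from pathlib import Path


def object_path(git_dir: str, sha: str) -> Path:
    """Return where a loose object with this hex hash would live.

    The first two hex characters name the directory, the rest name the file.
    """
    return Path(git_dir) / "objects" / sha[:2] / sha[2:]


def path_to_hash(path: Path) -> str:
    """Inverse mapping: rebuild the hex hash from directory and file name."""
    return path.parent.name + path.name


# Any 40-character hex string works here; this one is only an example value.
example = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
p = object_path(".git", example)
print(p)
print(path_to_hash(p) == example)
```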
Now let's try to write the same value again. As I said, content is immutable and deduplicated, so let's see if more objects appear. Nope: there is no need to store it, because it's already there with the same content. Let's do the same with one more string, this time with 2017 added, and see if there are more objects. There is one more, because it's a different thing.
Now let's explore what's inside those objects. I created this Jupyter notebook that will help us explore the content with Python. Here are some helper functions that will help me show things. Let's use glob to iterate over the object files; there are two objects in our repository. Let's create a function that lists the objects, because we will use it later; it strips the common parts of the file paths and shows just the hashes.
Let's read the first one. This function reads the file and basically just outputs the content. It looks like binary gibberish, because Git zips the contents of the object, so we need to unzip it. Now we see the content of our object: first comes the word "blob", then a number, then a zero byte, and then the actual content of my object.
So what is this blob? Blob is an object type; Git stores the object type inside the key-value store. The number is the content length. So now we know what we're going to store in our Python objects: the type, which here is blob, and then the content itself.
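Given the layout just seen (type, length, a zero byte, then the content), the hash that git hash-object prints can be reproduced in Python. This is a sketch that assumes a repository using SHA-1 object names; the helper names `git_object_bytes` and `git_hash` are made up for illustration.

```python
import hashlib


def git_object_bytes(obj_type: str, content: bytes) -> bytes:
    """Build the bytes that get hashed: '<type> <length>', a NUL byte, content."""
    header = f"{obj_type} {len(content)}".encode("ascii")
    return header + b"\x00" + content


def git_hash(obj_type: str, content: bytes) -> str:
    """Hex SHA-1 over header plus content (assumes a SHA-1 repository)."""
    return hashlib.sha1(git_object_bytes(obj_type, content)).hexdigest()


# The empty blob has a well-known hash, which makes a handy sanity check.
print(git_hash("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Note that the length is counted in bytes, so text has to be encoded before it is hashed.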
Again I wrote a small function that helps us read objects more conveniently than what we used so far. The read_object function reads the file, unzips it, and then uses a regular expression to extract the type of the object and the content. We don't use the length; it is there just for information purposes. Then let's write a function that iterates over all our object paths and reads the objects, and store the result in a variable. I also wrote a small function to print tables of named tuples more easily.
We can see that there are two objects in our repository, both of them of type blob, and we can see the content here: the one with the EuroPython string and the other one with 2017 added. Now that we have some stuff in the repository, let's go back to something closer to what we do every day.
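A reader along the lines described (read the file, decompress it, pull out type and content with a regular expression) might look like the sketch below. It is a reconstruction under assumptions, not the speaker's notebook code: the names `GitObject`, `parse_object` and `read_object` are chosen here, and the demo writes a throwaway file instead of touching a real repository.

```python
import re
import tempfile
import zlib
from collections import namedtuple
from pathlib import Path

GitObject = namedtuple("GitObject", ["type", "length", "content"])

# Header layout: "<type> <length>" followed by a NUL byte.
HEADER_RE = re.compile(rb"^(\w+) (\d+)\x00")


def parse_object(raw: bytes) -> GitObject:
    """Decompress a loose object and split it into type, length and content."""
    data = zlib.decompress(raw)
    match = HEADER_RE.match(data)
    if match is None:
        raise ValueError("data does not look like a Git object")
    obj_type = match.group(1).decode("ascii")
    length = int(match.group(2))
    return GitObject(obj_type, length, data[match.end():])


def read_object(path) -> GitObject:
    """Read one loose object file from disk."""
    return parse_object(Path(path).read_bytes())


# Demo on a temporary file that mimics a stored blob.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo-object"
    demo.write_bytes(zlib.compress(b"blob 6\x00hello\n"))
    obj = read_object(demo)
print(obj.type, obj.length, obj.content)
```

Pointing `read_object` at a file under `.git/objects` in a real repository should work the same way as long as the object is stored loose rather than inside a packfile.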
Let's create a file with just the current date in it. Let's see what's inside: nothing interesting, just what you'd expect. Now let's add this file and commit it, and see what changed in our repository. Let's get the list of objects again and see what's inside.
There are three more objects, and besides blob we now see two more types: tree and commit. The commit we are going to explore later, and the tree is some sort of binary gibberish again. We can also see the blob, the file that we just added as part of our commit. Note that blobs don't have names; they are pretty much nameless. A blob stores only the content itself, not the name of the file. This is important, and you will see why.
Let's extract the commit object and see what's inside. It is pretty readable text separated by newlines, so let's split it by newlines to see what's there. This commit object is pretty much plain textual metadata. It has headers: tree, author and committer, which happens to be me, and then the text of the commit message. What's interesting here is the tree header, because we can see that it points to a hash.
Let's create one more function that converts the commit headers into a dictionary. We'll use a multidict, which is just a third-party dictionary that can hold several values for the same key, because those headers aren't unique and some of them can appear twice. Let's use this helper function, and now I have a dictionary of headers. The most important thing here is the tree header. It looks like a pointer, so what kind of object does it point to, and what are its contents? Let's extract it from the headers, together with this function that converts a hash into a file path, nothing fancy.
Now let's load this tree object and print its contents. We can see that it has some numbers, then the file name, and then some binary stuff. Basically a tree is a list of objects, and that's it: it contains metadata for objects, for blobs and for other trees. Let's also write a function that retrieves an object from our collection by hash, so we don't have to type this again.
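Turning the commit headers into a dictionary that tolerates repeated keys could be sketched as follows. The talk uses a third-party multidict; this sketch swaps in a standard-library `defaultdict` of lists instead. The function name `parse_commit`, the sample hashes and the author line are invented for the example, and multi-line headers (such as signatures) are assumed to be absent.

```python
from collections import defaultdict


def parse_commit(content: bytes):
    """Split a decompressed commit body into (headers, message).

    Headers map each key to a list of values, since a key may repeat.
    Assumes no multi-line headers.
    """
    text = content.decode("utf-8")
    header_block, _, message = text.partition("\n\n")
    headers = defaultdict(list)
    for line in header_block.split("\n"):
        key, _, value = line.partition(" ")
        headers[key].append(value)
    return dict(headers), message


# Made-up commit body; every hash and name below is a placeholder.
sample = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"parent 1111111111111111111111111111111111111111\n"
    b"parent 2222222222222222222222222222222222222222\n"
    b"author A U Thor <author@example.com> 1499940000 +0200\n"
    b"committer A U Thor <author@example.com> 1499940000 +0200\n"
    b"\n"
    b"Example commit\n"
)
headers, message = parse_commit(sample)
print(headers["tree"], len(headers["parent"]), repr(message))
```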
Now let's parse this tree. Again we will use regular expressions; you can use regular expressions on binary data as well, and it's actually quite easy to parse binary formats with them. A tree object consists of entries, which are defined as follows. First comes a bunch of digits, which represent the Unix permissions, but we don't really care about them. Then comes the file name. Then there is the hash of the object that the entry refers to, in binary format: that is 20 bytes, whereas in hexadecimal format the SHA is 40 characters.
Now let's try this parse function and see what comes out. That's something we can read. We see that the tree consists of only one file and its pointer. So a tree is something that contains metadata for other objects, and here the tree connects the blob with the name of the file that it corresponds to. You can see what the hash starts with, and here it is: it points to this object. So Git pretty much implements a file system with these tree objects.
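The entry layout just described (mode digits, a space, the name, a zero byte, then 20 raw hash bytes) can be parsed with a bytes regular expression. The sketch below is one way to do it, not the speaker's exact code; `TreeEntry`, `parse_tree` and the sample entries are assumptions made for the example.

```python
import re
from collections import namedtuple

TreeEntry = namedtuple("TreeEntry", ["mode", "name", "sha"])

# One entry: "<mode> <name>", a NUL byte, then 20 raw bytes of SHA-1.
# DOTALL lets "." match a newline byte that may occur inside the raw hash.
ENTRY_RE = re.compile(rb"(\d+) ([^\x00]+)\x00(.{20})", re.DOTALL)


def parse_tree(content: bytes):
    """Parse the decompressed body of a tree object into a list of entries."""
    return [
        TreeEntry(
            mode=m.group(1).decode("ascii"),
            name=m.group(2).decode("utf-8"),
            sha=m.group(3).hex(),  # 20 raw bytes become 40 hex characters
        )
        for m in ENTRY_RE.finditer(content)
    ]


# Hand-built tree body with one file and one sub-directory (placeholder hashes).
blob_sha = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
subtree_sha = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
body = (
    b"100644 date.txt\x00" + bytes.fromhex(blob_sha)
    + b"40000 subdir\x00" + bytes.fromhex(subtree_sha)
)
entries = parse_tree(body)
for entry in entries:
    print(entry.mode, entry.name, entry.sha)
```

The name lives in the tree entry, not in the blob, which is why two files with identical content can share a single blob.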
Now let's do more stuff with our repository. Let's create a new directory and do the same thing there: write the date to a file, add it and commit.
Now let's see what kind of objects appear in our repo, using the same function. Here is the table. There is more stuff now, and it's getting hard to keep track of it visually.
Now let's look at what branches actually are in Git. The general term for branches is refs, references, and they live at this path: .git/refs/heads. We have only one branch, and it's master. So what is a branch in Git? It is just a file that points to the hash of the current commit. We can go and see what's in it: it is just the hash of the latest commit. This is an important distinction: branches in Git are not objects, they are merely named pointers to a commit.
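Since a branch is just a small file holding a commit hash, reading it takes one line. The sketch below assumes the ref is stored as a loose file under refs/heads (Git can also keep refs in a packed-refs file, which this does not handle); `read_branch` and the placeholder hash are made up for the demo.

```python
import tempfile
from pathlib import Path


def read_branch(git_dir, name: str) -> str:
    """Return the commit hash a branch points to (loose refs only)."""
    return (Path(git_dir) / "refs" / "heads" / name).read_text().strip()


# Demo with a fake .git layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    heads = Path(tmp) / "refs" / "heads"
    heads.mkdir(parents=True)
    # Placeholder hash; in a real repository Git writes this file itself.
    (heads / "master").write_text("0123456789abcdef0123456789abcdef01234567\n")
    tip = read_branch(tmp, "master")
print(tip)
```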
Because they're not objects, they are actually mutable, unlike the objects themselves. Let's extract this hash and strip the whitespace. Let's find the commit object that master points to and have a look inside, at its content and headers.
This is our master. The commit object has the tree header that we saw previously, but now there is a new field: parent. The first commit didn't have a parent, obviously, because it was the first one. So the parent points to the previous commit in this line. Let's convert this into a reusable function, because we might use it later.
Now let's look at the tree of the latest commit. Besides the file, it also contains a pointer to the new directory, with its hash. So it is basically a representation of the file system inside Git. Now let's compare it with the parent's tree.
Now let's make one more date file and do the same trick: write the date to a file, add and commit. Let's check the objects again: the list keeps growing. Let's read master again and look at master's tree. Now it has three entries. Compare that with the previous one, its parent's tree: the previous one has two entries, because I have just added the new one.
What we can see here is that the first file wasn't changed in the latest commit, and since it wasn't changed and has the same content, Git doesn't store one more version of this file, because it is the same version. The only thing that is different is the new entry. That's how deduplication works: Git never stores the same thing twice, because the same content has the same hash. That is the meaning of content-addressable: the content itself serves as the address of the object.
Our list keeps growing, so let's just note the current number of objects. Now let's do something more with our repository: let's create a branch and see what kind of objects appear. I want to work on a feature branch. Let's switch to this branch.
Now let's check again whether the count of objects changed. Surprisingly, it didn't. As I said, branches are not objects. When we create a branch, we just create one more file that points to the latest commit. It is the same commit, so there is no need to create anything new. So refs/heads/feature is just one more file.
Let's list the refs: there are two of them. If we compare them, feature and master are the same; they point to the same commit. Branches are not objects, so you can have different names for the same thing.
Now let's actually do something in our branch. Let's use the same technique, one more date file, so we're sort of working on the feature here. Now let's see whether the branches changed. We can see that the branches now point to different commits, because we added a new commit, and when you add a commit, Git moves the pointer of the branch to the newly created commit. The old one didn't change, of course.
Now let's go back to master and imagine that something was done in master in the meantime. Of course, I didn't create the file first; now let's add and commit. Now we're going to merge our feature branch into master. This creates a merge commit, and you get to write a commit message; Git generates an automatic text for us, so let's just use it.
Now let's investigate the current master commit and what's special about it. I'm extracting the master pointer again, and let's print the headers of the current master commit. We see a bunch of familiar headers, but there is something new: there are two parents, and that is actually what makes a merge commit special. It has the parent commit that was the previous one on this branch, and then a commit from the other branch. The ability to have several parents in a commit is what makes Git history not a linked list but a tree, and more specifically a directed acyclic graph. The merge commit doesn't contain any more information; it doesn't even say in the headers which branch was merged. It just has the first parent and the second parent, and its content is the file changes. OK, that's enough coding for today.
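The point that several parents turn history into a directed acyclic graph can be illustrated with a toy model. Here commits are plain dictionary keys mapped to their parent lists; the short names (`c1`, `f1`, `m1`) stand in for hashes, and `history` and `is_merge` are hypothetical helpers, not Git commands.

```python
def is_merge(parents) -> bool:
    """A merge commit is simply a commit with more than one parent."""
    return len(parents) > 1


def history(commits, tip):
    """Return every commit reachable from tip, each exactly once.

    commits maps a commit name to the list of its parents.
    """
    seen, order, stack = set(), [], [tip]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached through another parent
        seen.add(current)
        order.append(current)
        stack.extend(commits[current])
    return order


# Toy graph shaped like the demo: two branches diverge at c2 and are merged.
commits = {
    "c1": [],            # first commit, no parent
    "c2": ["c1"],
    "c3": ["c2"],        # commit made on master
    "f1": ["c2"],        # commit made on feature
    "m1": ["c3", "f1"],  # merge commit with two parents
}
walked = history(commits, "m1")
print(sorted(walked), is_merge(commits["m1"]))
```

The `seen` set matters because `c2` is reachable through both parents of the merge; in a simple linked list that could never happen.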
Let's do a small recap of what we've learned. A file is a blob. It doesn't even have a name or, you know, any metadata; it is just specific content. A tree is a list of blobs and trees, so a tree is a recursive structure, right? It can point to files and to other trees, and trees are the container for the metadata at the file storage level. Now, a commit is a tuple of essentially three things: the file system, which is implemented by linking to the tree at the time of the commit; the parent commit hash, or several parent commit hashes; and the metadata, like author, comment and that kind of stuff. So it captures the file system changes. And a branch itself is just a file, it's implemented as just a file that points to the last commit's hash.
And an important aspect: Git doesn't store deltas, Git stores the full blobs, so when you change a file, it stores the whole file again. This way it's very fast, because it doesn't resolve deltas and all of this stuff; it just gets the file by the hash immediately. That's why switching, checkout and branching operations are so fast. So what is a merge? A merge is a special kind of commit with two parents. That's it, that's the only thing that makes it, you know, a merge commit
special, and that's what makes history not a linked list but a tree. OK, now, immutability has very important practical implications. A change of content changes the hash, and a change of the hash means that the linking gets broken. And this has a very practical aspect: history in Git is immutable as well, not just the objects. The history is immutable because history is based on content linking. So say we have this very simple history, right, we have three commits. Those two-character numbers are just abbreviated hashes, and the master branch pointer points to 6b. Say we decided to change a file in a4, right, or a comment, or basically anything in a4.
So if we change something in a4, it essentially becomes a new object with its own hash, 7f, and it will point to the same parent commit. Since the content changed, it is a different object, because Git is content-addressable. And 6b doesn't point to 7f; it knows nothing about it. Basically, at this stage 7f is an unreachable commit.
So if we want to change the history, we need to rewrite history in Git. What are we going to do? We're going to create a commit that has the same content as 6b but as a parent points to 7f, right? It is the same content but with a different parent. Then we can reset the master reference to our newly created 85 commit. So basically, if we want to change history, we need to rewrite it and then kill the previous one by resetting the reference. The branches are mutable, so we can switch them later. After that we can run git gc and it will actually remove unreachable objects like a4 and 6b, or you can kill them manually, because they're not reachable.
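The hash chain described here can be reproduced with `hashlib`. Git hashes an object as SHA-1 over a header of the form `<type> <size>\0` followed by the body. The sketch below uses that rule, but the commit bodies are deliberately simplified (no author or committer lines), so the helper `make_commit` and the resulting hashes are illustrative only and will not match commits in a real repository.

```python
import hashlib


def git_hash(obj_type, body):
    """Hash an object the way Git does: SHA-1 over '<type> <size>\\0' + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def make_commit(tree, parent, message):
    """Build a simplified commit body (author and committer lines omitted)."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


tree = git_hash("tree", b"")  # the empty tree, used only as a stand-in

first = git_hash("commit", make_commit(tree, None, "first"))
second = git_hash("commit", make_commit(tree, first, "second"))
third = git_hash("commit", make_commit(tree, second, "third"))

# Changing anything in the middle commit gives it a new hash ...
second_new = git_hash("commit", make_commit(tree, first, "second, reworded"))
# ... and the old child still names the old parent, so the child has to be
# recreated on top of the new commit, which changes its hash as well.
third_new = git_hash("commit", make_commit(tree, second_new, "third"))
```

After this, a branch would have to be moved from `third` to `third_new`; the old `second` and `third` stay in the store until they are garbage-collected.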
So deltas are not really a part of logical Git, because if we change an object, even if it's a large object, it will just store a new version of the same object. Logically, Git stores no deltas. But that's quite inefficient, right? If you have a large file and you change just one line, it stores the same thing twice. So Git has a trick: it actually has pack files with deltas, but they're not on the logical level, they're completely transparent to the user. And actually, when you run git gc, which stands for garbage collection, it will pick up similar stuff. It does it heuristically: it doesn't do deltas based on the relationship of commits in the history, it just picks up similar-looking blobs and says: OK, this blob is based on this blob plus this line. And it actually works quite fast, and otherwise any large repository would be unbelievably large. OK.
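To get a feel for why deltas matter, the following sketch compares storing a second full copy of a file with storing only what changed. It is an illustration only: Git's pack files use their own binary delta format and their own heuristics, while this example borrows `difflib` from the standard library as a stand-in, and the file contents are made up.

```python
import difflib
import zlib

# Two versions of a hypothetical large file that differ in a single line.
old = "".join(f"line {i}\n" for i in range(2000))
new = old.replace("line 1000\n", "line 1000 (edited)\n")

# Logical model: the new version is stored whole (zlib-compressed).
full_copy = len(zlib.compress(new.encode()))

# Toy delta: only the lines that differ from the base version.
delta_text = "".join(
    difflib.unified_diff(old.splitlines(True), new.splitlines(True), n=0)
)
delta = len(zlib.compress(delta_text.encode()))

# How alike the two blobs look, judged from content alone, not from history.
similarity = difflib.SequenceMatcher(
    None, old.splitlines(), new.splitlines()
).ratio()
```

The delta is a small fraction of the full copy, and the similarity score is computed without knowing that one version descends from the other, which is the spirit of the heuristic described above.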
So we've got the basics of Git internals, and I hope you learned something, and I hope it gives you a foundation to learn even more about Git internals, so that more things become clear for you. And yeah, that's just the beginning, I hope, of your learning of Git internals, because there are many more interesting things: how storage works, how the network works, how deltas actually work and all that kind of stuff. I highly recommend the free and publicly accessible book Pro Git; it's the official Git book, and it has a chapter about Git internals, which I used as the main source of inspiration. I also used their icons; their license permits noncommercial use of the icons. Yes, and thank you very much for your attention. Thank you so much. So, are there any questions?
OK. Thank you very much for this nice journey through Git. Can you comment on how staging is represented in this context? I knew it, I knew it, that's why I prepared something in the presentation. Yes, the second thing that's sort of mind-boggling for people who start with Git is: what the hell is this staging area? And the worst part is that it's called by three words that seem to have nothing in common: the staging area is also known as the index, also known as the cache. It is an intermediary between your file system and the Git repository, which is used by the tooling. Why? Honestly, I think one of the worst parts of Git is these two commands: the first one will check out the branch with the name "branch", and the second command will purge all your changes, the local changes. So the problem is that Git seemingly does completely unrelated things in the two commands, but they're actually not that unrelated.
So if you look at this diagram, right, this is the working tree. If you want to commit something, you first git add it to the staging area, then you git commit and it gets into the repository. But this diagram doesn't tell the whole story, right? The staging area is something independent, something in between. So when you check out a branch, your staging area contains a cache of all the files in this branch, the current versions, right? And when you change a file and do git add, what it does is compare the contents of the file with the contents of the file in the staging area, the cache, that's why it's called the cache. If it sees differences, it moves the changed files into the staging area, right? And when you do git commit, it goes over the files that are marked as changed in the staging area, which is basically a tree of your changed things, picks all the things that were changed, creates a new tree from it, and then this tree goes into your commit.
So basically, when you do git checkout ., which purges everything, it basically means: check out all the files from the staging area to the working tree. And if we do checkout with a branch name, it will check out the files from the branch into, well, the cache, the staging area. OK, I hope that answers the question.
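The add and commit steps described in this answer can be modelled in a few lines. This is a toy model under stated assumptions, not Git's index format: the real index is a binary file with more fields per entry, the class `ToyIndex` and its methods are invented for illustration, and trees are flattened to a single level.

```python
import hashlib


def blob_hash(content):
    """Git's blob hash: SHA-1 over 'blob <size>\\0' followed by the content."""
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()


class ToyIndex:
    """A toy staging area: path -> blob hash, plus a set of changed paths."""

    def __init__(self, committed_files):
        # After a checkout the index caches every file of the branch.
        self.entries = {p: blob_hash(c) for p, c in committed_files.items()}
        self.changed = set()

    def add(self, path, content):
        """Like `git add`: compare with the cached entry, stage if different."""
        new_hash = blob_hash(content)
        if self.entries.get(path) != new_hash:
            self.entries[path] = new_hash
            self.changed.add(path)

    def commit(self):
        """Like `git commit`: snapshot all entries as the new (flat) tree."""
        tree = dict(self.entries)
        self.changed.clear()
        return tree


index = ToyIndex({"README": b"hello\n", "main.py": b"print(1)\n"})
index.add("README", b"hello\n")      # unchanged: nothing gets staged
index.add("main.py", b"print(2)\n")  # changed: staged
staged = set(index.changed)
tree = index.commit()
```

The new tree contains every file, not only the staged one, which matches the idea that a commit links to a full snapshot.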
Thanks for the great talk. Should we think that stash actually works the same way as staging, internally in Git? Stash, as far as I know, is something completely on the side, right? It has really nothing to do with, basically, the repository. Git just says: OK, let's save those changes into, you know, just a special file, and it's actually not part of, sort of, the version control system. It's basically just like a copy-paste area or something, just more advanced. It has really nothing to do with versioning at all. Thank you.
So you basically mentioned that branches aren't objects, but if I remember correctly, tags are objects. Could you comment on that? Yeah, indeed, there are actually two types of tags, so-called lightweight tags and the other one. A lightweight tag is pretty much the same as a branch: it is a file with the name of the tag, plus the commit it points to. So this is a lightweight tag. And then there is a non-lightweight one, I don't remember exactly what it's called. Annotated, thank you. Annotated, right. An annotated tag is one more object type, which points to the commit that it tags, plus you can add a comment and other metadata. That kind of tag is an object and as a result is, of course, immutable. Yeah, the two types of tags have nothing in common except that they're both called tags. Thank you. Do we have more questions?
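The difference between the two kinds of tags shows up in what the tag's ref file points at. The sketch below builds a throwaway object store in a temporary directory and checks it. It assumes loose (unpacked) objects and refs, the helper names are invented for illustration, and the object bodies are simplified (a real annotated tag also carries a tagger line, and a real commit carries author and committer lines).

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_object(git_dir, obj_type, body):
    """Store a loose object the way Git lays it out; return its hash."""
    data = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    sha = hashlib.sha1(data).hexdigest()
    path = Path(git_dir) / "objects" / sha[:2] / sha[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(data))
    return sha


def read_object(git_dir, sha):
    """Return (type, body) of a loose object."""
    path = Path(git_dir) / "objects" / sha[:2] / sha[2:]
    header, _, body = zlib.decompress(path.read_bytes()).partition(b"\x00")
    return header.split()[0].decode(), body


def tag_kind(git_dir, tag_name):
    """Tell whether a tag ref points at a commit or at a tag object."""
    sha = (Path(git_dir) / "refs" / "tags" / tag_name).read_text().strip()
    obj_type, _ = read_object(git_dir, sha)
    return "annotated" if obj_type == "tag" else "lightweight"


# Throwaway object store; all content below is made up.
git_dir = tempfile.mkdtemp()
commit = write_object(git_dir, "commit", b"tree " + b"0" * 40 + b"\n\ndemo\n")
tag_body = f"object {commit}\ntype commit\ntag v2.0\n\nrelease notes\n".encode()
tag_obj = write_object(git_dir, "tag", tag_body)

tags = Path(git_dir) / "refs" / "tags"
tags.mkdir(parents=True)
(tags / "v1.0").write_text(commit + "\n")   # lightweight: names the commit
(tags / "v2.0").write_text(tag_obj + "\n")  # annotated: names a tag object
kinds = {name: tag_kind(git_dir, name) for name in ("v1.0", "v2.0")}
```

The lightweight tag can be repointed by rewriting its file, while the annotated tag's message and target are fixed by the hash of the tag object.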
Thank you, and thank you for discussing rewriting the history. Is that something you would actually recommend? Say you have accidentally committed a large binary file and it's always with you. Yes, thanks for the question. It's something that I would not recommend doing at all, right? Because you reset your branches. So if you're the only user, it's OK, but if you push this, then when other people pull, the new master has nothing to do with the old master, right? And then you will need to do a git push with force or something like that. So yeah, in a collaborative environment it will create a lot of issues, so you should try to avoid it at all costs. By the way, one more remark: basically, because the history is immutable, it shares some similarities with the technology called blockchain, which is used in cryptocurrencies, right? That's where each block
is linked to its parent block. In this way you cannot tamper with old transactions, because you can easily verify that, you know, the hash doesn't match. So that's one of the ideas, I think one of the most intriguing ideas of Git. Any more questions? There's time for like one or two more. The one in the front.
Thank you. I understand now how master pointers and branch pointers work, but what about when you write things like HEAD? Is that some metadata, a pointer or something like this? Yeah, HEAD is like a special reserved reference. It is pretty much sort of like a branch, but not really a branch. As we can see, it's just a file, but not in the branches folder, just in the root folder. Basically, HEAD is a pointer to the current branch. And there is what is known as a detached HEAD, right? So if I want to, you know, check out a specific commit, which I can name in the checkout command, then HEAD, instead of pointing to some branch, will point specifically to this commit. So yeah, HEAD is sort of like a reserved branch name, of sorts. It operates pretty much in a similar fashion, but it is not part of all the branches. OK, thanks.
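Both states of HEAD can be read straight from the file. When a branch is checked out, the `HEAD` file contains a line of the form `ref: refs/heads/<branch>`; in the detached state it contains a commit hash. The sketch below handles both cases. The function name `read_head` is made up, and it assumes the branch ref is a loose file rather than an entry in `packed-refs`.

```python
from pathlib import Path


def read_head(git_dir):
    """Return (branch, commit); branch is None when HEAD is detached."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Attached: HEAD names a ref file, which in turn names the commit.
        ref = head[len("ref: "):]
        ref_file = git_dir / ref
        commit = ref_file.read_text().strip() if ref_file.exists() else None
        prefix = "refs/heads/"
        branch = ref[len(prefix):] if ref.startswith(prefix) else ref
        return branch, commit
    # Detached: HEAD holds the commit hash itself.
    return None, head
```

In a fresh repository with no commits yet, the branch file does not exist, which is why the commit can come back as `None`.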
So, thank you very much for this nice talk. You mentioned something about merge. Now, what does it do internally, and how does this compare to, you know, rebase? That's something I'm not going to demonstrate, because that's one more whole talk, the difference between merge and rebase. So, rebase... OK, merge first. A merge is a commit with two parents, and Git automatically tries to merge it, and if it cannot merge automatically, it will say: OK, sorry, there are conflicts, resolve them yourself. So how does rebase work? Rebase basically replays the changes from the other branch on top of the current branch, right? So Git tries to see what changes, what deltas, like logical deltas, the diffs, were in this branch, and then it will try to apply the same actions on the current branch. So yeah, the result is the same: basically, in the result we'll have a commit that contains the same information, but it will not have any information about the merge. All the rebased commits have one parent, right? The result is the same in terms of content, if there are no conflicts of course, but in terms of structure all rebased commits have a single parent. But yeah, that's really messy in public. I personally, on the one hand, like rebasing; yes, a great influence