
Decentralized Storage with IPFS


Formal Metadata

Title
Decentralized Storage with IPFS
Subtitle
How does it work under the hood?
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
The Interplanetary File System (IPFS) is a decentralized file system for building the next generation of the internet. IPFS is a distributed system for storing and accessing files, websites, applications, and data. In this talk, we’ll dive into how decentralised storage with IPFS works under the hood as it builds on top of many long-known, well established techniques and yet is more than just the sum of its parts. We’ll cover basic working principles like content-addressing and content-routing, followed by the results of an extensive measurement campaign. Answering questions like: How does content-addressing work? Who stores my data if I upload something to IPFS? How do you retrieve content if you only know the hash of it? How fast is that process?
Transcript: English (auto-generated)
So great to see you all. So many people here, that's awesome. Welcome to my talk. It's called Decentralized Storage with IPFS. Maybe first of all, a quick poll: how many of you have used IPFS? Please raise your hand.
Okay, okay, nice. And how many of you have heard about IPFS? Okay, all of you. Okay, cool. So you know all about it already. Yeah, so the talk is called How Does It Work Under the Hood? So we will dive in pretty deep at some points of the talk. But yeah, first things first. My name is Dennis. I'm a research engineer at Protocol Labs.
I'm working in a team called ProbeLab, and we're doing network measurements and protocol optimizations there. I'm also an industrial PhD candidate at the University of Göttingen, and you can reach me on all these handles on the internet, so if you have any questions, reach out and let me know, or just find me here at the venue after the talk.
So what's in it for you today? First of all, just in words and numbers: what is IPFS? Just a general overview. After we've covered that, I will assume we have installed a local IPFS node on your computer, and I will walk you through the different commands, from initializing the repository to publishing content to the network, and so on, and I'll explain what happens in each of these steps, so that all of you hopefully get a glimpse of what's going on under the hood. So we are importing content, we connect to the network, I explain content routing (this is the very technical part), and at the end, some call-outs, basically. So what is IPFS?
IPFS stands for the Interplanetary File System, and generally, it's a decentralized storage and delivery network which builds on peer-to-peer networking and content-based addressing. So peer-to-peer networking, if you have followed along, or if you've been here earlier today, Max gave a great talk about libp2p, about browser connectivity in general in peer-to-peer networks
and IPFS is one of the main users of the libp2p library and builds on top of that. And most importantly, in very tiny letters at the bottom: IPFS is not a blockchain. That's a common misconception I'd like to clear up. In numbers, and these numbers are from mid-last year, so probably in need of an update: it has been in operation since 2015, that hasn't changed; the number of requests exceeds a billion a week; we see hundreds of terabytes of traffic; and tens of millions of active users, also weekly. But a disclaimer: this is just from our vantage point. In a decentralized network,
no one has a complete view of what's going on, so these numbers could be much higher or just different in general. On ecosystem.ipfs.tech, you can find some companies that build on top of this tech, and it's all in these different areas, social media, and so on and so forth,
so worth looking up. What's the value proposition of IPFS? The most important thing it does is decouple the content from its host, and it does this through a concept called content addressing. Content addresses are permanent, verifiable links, and this allows you to request data with such a content address, and anyone can serve you the content. Just from the address that you asked with, you can identify and verify that the content you got served is actually the one that you requested, so you are not dependent on the authenticity of the host, as is the case with HTTP.
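As a minimal sketch of that verification idea, assuming the address is just a bare SHA-256 digest (a real CID carries extra metadata, covered below):

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// verified re-hashes whatever bytes a host served us and compares the
// digest against the address we requested with; no trust in the host
// is needed, and a mismatch exposes tampering or corruption.
func verified(addr [sha256.Size]byte, served []byte) bool {
	sum := sha256.Sum256(served)
	return bytes.Equal(addr[:], sum[:])
}

func main() {
	data := []byte("hello FOSDEM")
	addr := sha256.Sum256(data) // the "content address" we ask the network for
	fmt.Println(verified(addr, data))                 // true
	fmt.Println(verified(addr, []byte("tampered!!"))) // false
}
```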
Because it's a decentralized network, it's also censorship resistant, and I like to put here that it alleviates backbone addiction. So what do I mean by that? Let's imagine all of us wanted to download a 100-megabyte YouTube video here in this room. If we were 100 people, we would put about 10 gigabytes of pressure onto the backbone just to download the video into this room. Wouldn't it be better if we could download it once and distribute it among each other, or download different parts, and be a little bit more clever about that? In a similar vein, if we were working on a Google Doc here inside this room, why does it stop working if we don't have an internet connection anymore? It should actually work; it's actually ridiculous. Also falling into the same category, this partition tolerance could become very important for emerging networks, or if you're just on a patchy coffee shop wifi. All right, so how can you install IPFS?
So there, I put down three different ways here. IPFS in general is not... you don't install IPFS; IPFS is more a specification, and there are different implementations of this specification. The most common one is Kubo, which was formerly known as go-ipfs, so it's a Go implementation. There's a newer one called Iroh, which is in Rust, and I think the newest one is in JavaScript, called Helia. Yeah, I think that's the newest kid on the block. I will talk about Kubo here, and the easiest way to get started is to just download IPFS Desktop, which is an Electron app that bundles an IPFS node, gives you a nice UI, and you can already interact
and request CIDs from the network, and so on. Then there's IPFS Companion, which is a browser extension that you can install in Firefox or your browser of choice, or you directly use Brave or Opera, which come with a bundled IPFS node already, so if you enter ipfs:// followed by a CID, it will resolve the content through the IPFS network.
But as I said in the beginning, in this talk we will focus on the command line, because we're at a developer conference, and I will also assume that we run Kubo, which is the reference implementation, basically. So now we have downloaded Kubo from github.com/ipfs/kubo, and we want to import some content. We just want to get started. We downloaded it, and now we have this ipfs command on our machine. The first thing that we do is run ipfs init, and what this does is generate a public-private key pair, by default Ed25519, and it spits out this random-looking string of characters, which is basically your public key. Formerly it was just the hash of your public key, but now your public key is encoded directly in here, and this is your peer identity, which will become important later on. Then it also initializes your ipfs repository, by default in your home directory under .ipfs. This is the location where it stores all the files. So if you interact with the ipfs network and request files, it stores them in this directory in a specific format, similar to Git's object store, basically. And importantly, and I will point this out a couple of times, this is just a local operation.
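A rough sketch of what that key-generation step amounts to; Kubo actually goes through libp2p and encodes the peer ID in a multibase/multihash form, which is glossed over here:

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

func main() {
	// ipfs init, conceptually: generate an Ed25519 key pair. The public
	// key becomes the node's peer identity; the private key stays in the
	// local repository and later signs records. No network involved.
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	fmt.Println("peer identity (public key):", hex.EncodeToString(pub))
	_ = priv // kept secret under ~/.ipfs, never shared
}
```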
So we haven't interacted with the network at all yet. Now we are ready to go. I have a file I want to add, so what I do is run ipfs add and then my file name. In this case, ipfs, or Kubo, gives you a progress bar and spits out again a random-looking string of characters,
which is the content identifier, the CID, which is the most fundamental ingredient here, and this is the part where it decouples the host, sorry, the content from its host. And as a mental model, you can think about the CID as a hash with some metadata. It's self-describing. So the metadata is this description part.
You can see the ingredients at the bottom. So it's just an encoded version of some information, like a CID version, so we have version zero and one. And some other information that I won't go into right now. Then it's self-certifying. This is the point where if you request some data from the network, you certify the data
that you got served, with the CID itself and not via the host that served you the content, just reiterating this. And it's an immutable identifier. And all this structure, like the CID structure at the bottom and so on, is governed by a project called Multiformats, which is also one of Protocol Labs' projects.
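To make the "hash plus metadata" mental model concrete, here is a sketch that assembles a CIDv1 by hand, assuming the raw codec and SHA-256; real code would use the go-cid and multiformats libraries instead:

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
)

func main() {
	data := []byte("hello FOSDEM")
	digest := sha256.Sum256(data)

	// Layout: <version><content-codec><hash-fn><digest-len><digest>
	cid := []byte{
		0x01, // CID version 1
		0x55, // multicodec: raw bytes
		0x12, // multihash function: sha2-256
		0x20, // digest length: 32 bytes
	}
	cid = append(cid, digest[:]...)

	// Multibase: the leading 'b' says "lowercase base32, no padding".
	b32 := base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)
	fmt.Println("b" + b32.EncodeToString(cid))
}
```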
The talk is called How Does It Work Under the Hood, so what actually happened here? ipfs saw the file, which is just this white box here, a stream of bytes, and ipfs chunked it up into different pieces, which is a common technique
in networking, actually. And this gives us some nice properties. It allows us to do piecewise transfers, so we can request blocks from different hosts, actually. And it allows for deduplication. So if we have two blocks that are basically the same bytes, we can deduplicate that
and save some storage space underneath. And also, if the file was a video file, we also allow for random access, so we could start in the middle of a video and don't need to stream all the previous bytes at all. And after we have chunked that up, what we do now, or what ipfs does now,
is we need to put it together again. And what we do here is we hash each individual chunk. Each chunk gets its own CID, its own content identifier. Then the combination of each CID, again, gets another CID. And we do this for both pairs at the bottom.
And then the resulting common CIDs, again, will be put together yet again to generate the root CID, that's how we call it. And this is actually the CID that you see in the command line up there. So we took the chunks, put the identifiers together to arrive at the final CID at the top.
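A toy version of that bottom-up construction, assuming bare SHA-256 over concatenated child digests; the real thing builds a UnixFS Merkle DAG out of protobuf nodes with 256-kilobyte chunks:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot chunks the data, hashes each chunk to get the leaf digests,
// then repeatedly hashes pairs of child digests until one root remains.
func merkleRoot(data []byte, chunkSize int) [32]byte {
	var level [][32]byte
	for len(data) > 0 {
		n := chunkSize
		if n > len(data) {
			n = len(data)
		}
		level = append(level, sha256.Sum256(data[:n])) // leaf "CIDs"
		data = data[n:]
	}
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd child is carried up
				continue
			}
			pair := append(append([]byte{}, level[i][:]...), level[i+1][:]...)
			next = append(next, sha256.Sum256(pair)) // parent "CID"
		}
		level = next
	}
	return level[0] // the digest behind the root CID
}

func main() {
	fmt.Printf("root: %x\n", merkleRoot([]byte("a file that spans several chunks"), 8))
}
```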
And this data structure is actually called a Merkle tree. But in ipfs land, it's actually a Merkle DAG, because in Merkle trees, nodes are not allowed to have multiple parents, which is exactly what deduplicated blocks have. And DAG here means a directed acyclic graph. Now let's imagine you didn't add a file, but a directory. How do you encode the directory structure
and not only the bytes and so on? All of these formatting, serialization, and deserialization concerns are governed by yet another project, called IPLD, which stands for InterPlanetary Linked Data. IPLD also does a lot more things, but for now, this is what's specified in the scope of this project.
So now we have imported the content. We have chunked it up, we've got the CID, but again, we haven't interacted with the network yet. So people think if you add something to IPFS, you upload it somewhere and someone else takes care of hosting it for you for free, which is not the case.
So we added it to our local node. So now it ended up in this IPFS repository somewhere on our local machine. But only now we connect to the network and interact with it. For that, we run IPFS daemon, which is a long running process that connects to nodes in the network.
We see some versioning information: which Go version it was compiled with, which Kubo version we actually use. We see the addresses that the Kubo node listens on, and also which ones are announced to the network, that is, under which network addresses we are reachable. And then it tells us that it started an API server, a web UI, and the gateway.
The API server is just an RPC API that is used by the command line to control the IPFS node. The web UI is the thing that you saw previously when you saw the screenshot of the IPFS desktop. So your local Kubo node also serves this web UI. And then the gateway. And the gateway is quite interesting.
So this bridges the HTTP world with the IPFS world: under the endpoint that you can see down there, if you put /ipfs/ plus your CID in the browser, or in your asset URL, the Kubo node will go ahead, resolve the CID in the network, and serve it to you over HTTP.
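For illustration, fetching through a gateway is a plain HTTP GET against the /ipfs/ path; the CID below is a placeholder, and the local port is Kubo's default gateway address:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	cid := "bafy..." // placeholder: put a real CID here
	// Kubo's local gateway listens on 127.0.0.1:8080 by default; public
	// gateways such as https://ipfs.io expose the same /ipfs/<CID> path.
	resp, err := http.Get("http://127.0.0.1:8080/ipfs/" + cid)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("fetched %d bytes via the HTTP gateway\n", len(body))
}
```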
So this is like a bridge between both worlds. Protocol Labs and Cloudflare and others are actually running these gateways on the internet right now, which gives you a low-barrier entry to the whole thing. And then the daemon is ready. In this process, it has also connected to bootstrap nodes, which are hard-coded, to get to know other peers in the network. But you can also override them with your own bootstrap nodes. So now we are connected to the network, and we've added our file on our own machine. But now comes the interesting problem, or rather the challenge: how do we actually find content hosts for a given CID?
So I give my friend a CID. How does their node know that it needs to connect to me to actually request the content? I put here: the solution is simple, we keep a mapping table. We just have each CID mapped to the actual peer, and every node has this table on its machine, so everyone knows everything, basically.
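That strawman is literally one global table replicated on every node, something like this sketch with hypothetical string types:

```go
package main

import "fmt"

type CID string    // content identifier
type PeerID string // peer identity

// The naive design: every node holds the full mapping of every CID in
// the network to a peer that hosts it. Simple, but it cannot scale.
var everything = map[CID]PeerID{
	"bafy-example-1": "peer-A",
	"bafy-example-2": "peer-B",
}

func main() {
	fmt.Println("ask anyone:", everything["bafy-example-1"]) // peer-A
}
```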
But as I said, the mapping table gets humongous, especially if we've split up those files into different chunks. And I think the default chunking size is 256 kilobytes. So we have just a lot of entries in this table. So this doesn't scale. So the solution would be to split this table and each participating peer in this decentralized network
holds a separate part of the table. But then we are back to square one: how do we know which peer holds which piece of this distributed hash table data? The solution here is to use a deterministic distribution based on the Kademlia DHT. Kademlia is an implementation, or like a specific protocol, for a distributed hash table. At this point, many talks on the internet about IPFS gloss over the DHT and how it works, and when I got into this whole thing, I was missing something. So my attempt here is to dive even a little deeper into this, and I will cover a bit of Kademlia. This is very technical, but at the end I will try to summarize everything so that every one of you gets a little bit out of it. This whole process, the resolution of a CID to a content host, is called content routing. IPFS uses an adaptation of the Kademlia DHT with a 256-bit key space: we hash the CID and the peer ID yet again with SHA-256 to arrive in a common key space. The distributed hash table in IPFS is just a distributed system that maps these keys to values. The most important records here are provider records, which map a CID to a peer ID (remember, the peer ID is what was generated when we initialized our node), and peer records, which then map the peer ID to actual network addresses, like IP addresses and ports.
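In code, the two record types amount to something like this sketch; the field types are simplified, and the real records are signed protobufs in the libp2p DHT:

```go
package main

import "fmt"

// ProviderRecord: "this peer claims to have the content behind this CID".
type ProviderRecord struct {
	CID    string
	PeerID string
}

// PeerRecord: "this peer is reachable at these network addresses".
type PeerRecord struct {
	PeerID string
	Addrs  []string // multiaddrs, e.g. "/ip4/1.2.3.4/tcp/4001"
}

func main() {
	prov := ProviderRecord{CID: "bafy-example", PeerID: "12D3Koo-example"}
	peer := PeerRecord{PeerID: prov.PeerID, Addrs: []string{"/ip4/198.51.100.7/tcp/4001"}}
	// Lookup is a two-step process: CID -> peer ID -> addresses.
	fmt.Println(prov.CID, "->", prov.PeerID, "->", peer.Addrs)
}
```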
So looking up a host for a CID is actually a two-step process: first we resolve the CID to a peer ID, and then the peer ID to their network addresses, and then we can connect to each other. The distributed hash table here has two key features. First, an XOR distance metric.
That means we have some notion of closeness. What this XOR thing does: if I XOR two numbers together, the resulting number satisfies the requirements for a metric. This means I can say a certain peer ID is closer to a CID than some other peer ID; in this case, peer ID X could be closer to CID 1 than peer ID Y. And this allows us to basically sort peer IDs by their distance to a CID, as in the sketch below.
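Here is that sketch: a minimal XOR distance over the 256-bit key space, with keys assumed to be already hashed; sorting by this distance is also how the 20 closest peers for a provider record are picked later on:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

type Key [32]byte // SHA-256 of a CID or peer ID

// distance XORs two keys; treating the result as a big-endian integer
// gives a valid metric: d(a,a)=0, it is symmetric, and it satisfies the
// triangle inequality.
func distance(a, b Key) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closest sorts peers by XOR distance to the target key.
func closest(target Key, peers []Key) []Key {
	sort.Slice(peers, func(i, j int) bool {
		return bytes.Compare(distance(peers[i], target), distance(peers[j], target)) < 0
	})
	return peers
}

func main() {
	var cid, x, y Key
	cid[0], x[0], y[0] = 0b1010_0000, 0b1010_1000, 0b0010_0000
	fmt.Println(closest(cid, []Key{y, x})[0] == x) // true: x shares a longer prefix with cid
}
```

Note how closeness under XOR lines up with shared leading bits, which is exactly what the routing table buckets below exploit.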
The second key feature is the tree-based routing mechanism, shown in this bottom-right diagram, which I got from the original paper; we are the black node. This tree-based routing is super clever: all the peers in the network can be considered as sitting in a big trie, a prefix trie, and if we know only one peer in each of these bubbles, we can guarantee that we can reach any other peer in the network with O(log n) lookups, by asking for ever closer peers based on this XOR routing mechanism. So this was, just abstractly, what the distributed hash table in IPFS does. How does it work concretely for IPFS? So we started the daemon process. What happened under the hood was: we calculated the SHA-256 of our peer ID, which just gives us a long string of bits. And we initialized the routing table at the bottom. This routing table consists of different buckets, and each bucket is filled with peers that have a common prefix with the hash of our peer ID at the top. When our node started up, we asked the bootstrap peers: hey, do you know anyone whose SHA-256 of their peer ID starts with a one? That means we have no common prefix, and we put those peers in bucket zero.
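In a sketch, that bucket assignment is just the length of the shared bit prefix between the two hashed peer IDs:

```go
package main

import (
	"fmt"
	"math/bits"
)

type Key [32]byte // SHA-256 of a peer ID

// bucketIndex counts the leading bits two keys share: peers with no
// shared prefix land in bucket 0, one shared bit in bucket 1, and so
// on, up to 255.
func bucketIndex(self, other Key) int {
	for i := range self {
		if d := self[i] ^ other[i]; d != 0 {
			return i*8 + bits.LeadingZeros8(d)
		}
	}
	return 256 // identical keys, i.e., ourselves
}

func main() {
	var self, other Key
	self[0], other[0] = 0b0110_0000, 0b0100_0000 // first difference at the third bit
	fmt.Println(bucketIndex(self, other))        // 2
}
```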
Then we do the same for longer and longer shared prefixes, and so we go through the whole list up to 255 and fill up these buckets. These buckets are those little blobs, those little circles, that you saw on the previous slide. And why did we do that? Because when we now want to retrieve content: as I said, I handed the CID to my friend, and my friend enters the CID on the command line with this ipfs get command. Their node also calculates the SHA-256 of the CID, then looks in its own routing table and sees: okay, I have a common prefix of two. So it locates the appropriate bucket, bucket two, and gets the list of all the peers in it. Then it asks all of these peers in the bucket: first of all, do you know the provider record already, the mapping from this CID to a peer ID? If yes, we are done. But if not, we ask: do you know anyone closer, based on this XOR metric? And that peer then looks in its own routing table in turn, and so we get closer and closer, with this O(log n) property that I showed you previously. Okay. And for publishing content, it's basically the same. We calculate the SHA-256 of the CID,
locate the appropriate bucket, get a list of all the peers from that, and then we start parallel queries. But instead of asking for the provider record, we ask for even closer peers. And we terminate when the closest known peers in the query actually haven't replied with any peer that's closer.
That is, when the closest known peers haven't replied with anyone closer to the CID than we already know. And then we store the provider record with the 20 closest peers to that CID. We use 20 because there's peer churn: this is a permissionless network, which means peers can come and go as they wish, and if we only stored the record with one peer, we would risk the provider record being unreachable when that node goes down, and in turn the content being unreachable. So this was the very technical part, but let me summarize it; maybe this is the easier way to understand all of this. First of all, we added the content to our node, so the file enters the provider. The provider looks in its routing table, gets redirected to a peer that is closer to the CID, and gets redirected again until it finds the closest peer
in this XOR key space to the CID. And then it stores the provider record with that peer. Then, out of band, the CID gets handed to the requester, to my friend. And what I haven't told you yet: IPFS also maintains a long list of, I don't know how many it is right now, probably 100 or so, constant connections to other peers, and it opportunistically just asks them: hey, do you know this CID, or the provider record for this CID? If this resolves, all good, we are done. But it's very unlikely for a peer to actually know a random CID, so let's assume this didn't work. The requester also looks in its own routing table, gets redirected closer and closer to the key of that CID, and then finds the peer that stores the provider record, fetches the provider record, and then does the same hops again to find the mapping from the peer ID to the network addresses. And then we can actually connect to each other, transfer the content, and we're done. So this is the content lifecycle. And this is basically already it.
Well, "already it" is quite a bit, quite involved actually. And with that, it's already time for some call-outs. Get involved: IPFS is an open-source project. If you're into measurements and so on, we have some grants open at radius.space. If you want to get involved with some network measurements, get your applications in; everything happens in public. You can follow along with our work, especially my work and our team's, at this GitHub repository. We have plenty of requests for measurements that you can dive into.
And extra ideas are always welcome. In general, IPFS is, I think, a very welcoming community, at least for me. And yeah, just, that's it. Thank you very much.
So, okay. Any questions? Is the way you described, using the DHT, how all nodes in the network share files with each other? It's one content routing mechanism, so there are multiple ones. This first thing that I said here, the opportunistic request to your immediately connected nodes, is also some kind of content routing; you're resolving the location of content. Then there are some new efforts to build network indexers, which are just huge nodes that store the mappings: centralized nodes, or rather federated centralized nodes, so not as bad. And I think these are the important ones, basically. So there are more ways to resolve content; mDNS could also be one part, so if you are on the same network, you're broadcasting.
I know, that's just for... sorry? For the local network, yeah, okay, true. Luckily, we have a core maintainer of IPFS here; that's actually not a joke. So: I see that the provider records get replicated, but does the content actually get replicated across the network too? No, only if someone else chooses to. You're publishing the provider record, so it's public somewhere, and anyone could look that up and also store the content themselves. So this is the idea: if content is popular and you care about the content staying alive in the network, you pin the CID, as it's called, and this means you're fetching the content from this other provider, storing it yourself, and becoming a provider yourself. And because of the CID mechanism, which is self-certifying and so on, other peers that request the content from you don't even need to trust you, because the CID already encodes the trust chain here. But nothing is happening automatically here. But you can have multiple providers for the same content? Definitely, yeah, definitely. That's part of it. Another question: how does
the project fit the concepts of identity, trust, and personas into IPFS? I'm thinking of metadata, verifications about the content, and stuff like that. What do you mean exactly? Like, for instance, a history of the content, and whether you can trust that this content is from a certain person, or a certain source. I would argue this would probably be some mechanism on top of this content identification. So this is more something for IPLD then? Perhaps, I would say. If you want to say some content is from some specific person, then you would work with signatures, signing the data and so on, which is something you would bolt on top of IPFS. But nothing, I think, that IPLD encodes right now. Right on. It's partly the same question, but how is it ensured that
there are no collisions in the content ID? No collisions? Yes, because if you publish some other content with the same content ID, and you said the content ID generation happens locally, you could fake content. Yeah, okay, but then these cryptographic hash functions would be broken, which would be very bad. And if you have a hash collision, it actually means you have the same content; that's the assumption right now. Or maybe, yeah, Jorropo? We just use SHA-256 by default, and you can also use another one, like BLAKE3 or BLAKE2, but if you find a collision in SHA-256, you have bigger problems than IPFS not working. Yeah, exactly. A follow-up on this: how resilient is this against malicious actors that want to prevent me from reaching the content?
It's a big question, but maybe something: in P2P networks, these kinds of Sybil attacks are an often-considered attack vector, which means you generate a lot of identities to populate some part of the key space and block some requests from reaching their final destination, and so on. From my experience, this is quite hard, and I haven't seen it happening. I cannot say that it's impossible; it's probably hard to tell. Max, do you want to add something? Yeah, Kademlia has this mechanism where only long-lived peers stay in the routing table. True, yeah. So this Sybil thing is just one attack vector, but it's the common one that is considered. There are many points in the code base where you need to think about: okay, what happens if a Sybil attack is going on? And one thing that Kademlia does is prefer long-running, stable nodes in the routing table, so that if someone suddenly generates a lot of identities, they don't end up in your routing table and pollute your routing, your content routing here, or interfere with it. All right, go ahead. Not sure if I want to ask it, but removing content, deleting, you know, we've got GDPR, so: is there any solution for that?
So yeah, it's hard. That's part of the thing: if you could, then it's not censorship resistant anymore. One solution, well, one alleviation maybe, is to have blacklists of CIDs that may or may not be published, to say: okay, don't replicate this CID, and so on. But also, if you have such a list, then it's very easy to just look it up and see what's inside. So deleting content is very tricky. However, I said the links are permanent. Yeah, the links are permanent, but content actually still churns in the IPFS network, and the provider records that you publish into the network expire after 24 hours. So if no one re-provides the content or keeps the content around, the content is gone as well. But a delete operation doesn't exist. So we would just need to hope that no one re-provides it anymore, which you could steer with these deny lists, for example. Yeah, Daniel, okay. Who is able to write into that blacklist, and is there any...?
Yeah, I don't know, to be completely honest. But maybe Jorropo knows, yeah. There is no blacklist in the network right now; it's a few people that want that. But, sorry, earlier you said that we have gateways, and a gateway is just a node that is publicly reachable. And those gateways: many people find some content on IPFS illegal, and instead of reporting it to the actual node storing the content on IPFS, they just report it to the gateway, because they know HTTP and they don't know IPFS. And so our gateway has some blacklist that is somewhere, but it's not shared by the complete network; it's just for our gateway, ipfs.io. So what I observe: Cloudflare, for example, and Protocol Labs are running these gateways, and more, and anyone could operate a gateway, and so you could file a request saying: don't serve this CID, it's a phishing website, for example, and then these CIDs are not served through the gateways, which is a common way to interact with the network right now. It's just the gateway that follows the list, though; it's not automatic. Yeah, of course. Okay, we're running out of time, unless there is one more. Okay.
I have a question regarding searching through the stored content. Is there any mechanism to go through or index the files that are there, to have some sort of search engine for that? Right, so there's a project called IPFS Search,
and this makes use, among other things, of this immediate request for CIDs, so it's just sitting there, connecting to a lot of nodes, and as I said, if someone requests content, you immediately ask your connected peers, and you're connected to a lot of peers, and these IPFS search nodes are just sitting there
listening to these requests, and they see: okay, someone wants this CID, so I go ahead and request that CID as well, and then index that content myself. You can then search on this IPFS Search website for something, just like with Google, and you see CIDs popping up, and then you can request those CIDs
from the IPFS network, so this is one approach to do that, to index content, yeah. Okay, thank you. Okay, thank you. Thank you.