
Tips and Tricks to Scale Elasticsearch for 1M RPMs and Beyond


Formal Metadata

Title
Tips and Tricks to Scale Elasticsearch for 1M RPMs and Beyond
Number of Parts
69
License
CC Attribution 3.0 Unported:
You may use, adapt, copy, distribute, and publicly share the work or its contents in unchanged or adapted form for any legal purpose, as long as you credit the author/rights holder in the manner specified by them.

Content Metadata

Abstract
What do you do when throwing more hardware at your Elasticsearch cluster doesn't help, and management tells you that this year the search traffic on your eCommerce site will triple? In this talk, I will present several not-so-obvious Elasticsearch performance issues and provide proven recipes for preventing and fixing them. I'll also share bits of the infrastructure of our search stack that empowered the team to experiment fast without impacting production and to scale Elasticsearch to handle 1M+ RPMs.
Transcript: English (automatically generated)
So hello everyone, and welcome to my talk, which is called Tips and Tricks to Scale Elasticsearch for 1 Million Requests per Minute and Beyond. First, I want to thank Vinted for giving me the opportunity to work on interesting technical challenges.
And I want to thank Berlin Buzzwords for accepting my talk. So, who am I? My name is Dainius Jocas. I work for a company called Vinted, whose mission is to make second-hand the first choice worldwide. I'm a staff engineer there who mostly oversees the search features.
I do have some online presence: I have my personal website, and I post something on Twitter. I'm also the author of a little search tool called lmgrep; check GitHub for that. So, the agenda for this presentation: I'll do a little introduction, and then I'll talk about the scale of the Elasticsearch installation at Vinted a bit more than a year ago.
Then I'll talk about three optimization strategies, then I'll briefly describe what the scale was a year later, and at the end we can have a little discussion.
So, first, Vinted and Elasticsearch. As I mentioned before, Vinted is a second-hand clothing marketplace. Vinted operates in more than 10 countries, and it has to support more than 10 languages, because we mostly operate in Europe, and Europe is rich in various languages.
The first installation of Elasticsearch at Vinted was done in 2014, and the first version was 1.4.1; I checked this in our Chef repository. Today I'm going to share with you a couple of tips and tricks on how we evolved our installation from there.
So let's begin with the scale as of January 2020. At that time Elasticsearch was version 5.8.6, which was more or less at the end of its life.
This version was a little bit old, because an upgrade to a newer Elasticsearch version had been tried some time ago and failed miserably. That caused some downtime, and it was kind of a hot topic there. So, the Elasticsearch cluster was really big.
We had one main cluster with around 420 data nodes, and each data node had 14 CPUs and 64 GB of RAM. The whole installation was done on bare metal. At that time the search throughput was around 300,000 requests per minute during the peak hours.
By peak hours, I mean the evenings, when people are browsing the website the most. At that time we had about 160 million documents in our primary index, and that index was divided into 84 primary shards with 4 replicas. The 99th percentile of latency during the peak hours was floating around 200 to 250 milliseconds.
And the cluster had some performance issues. The slow log (meaning the slow queries; we consider slow queries to be the ones with a latency of more than 100 milliseconds) just skyrocketed during the peak hours. So we had some performance issues. From the company's perspective, the Elasticsearch installation was seen as a business risk, because it was not coping with the load.
And most of the backend developers didn't want to touch it, because it was complicated and brittle. However, management expected that the usage of the Vinted platform was going to at least double by October of the same year, meaning that there would be a similar increase in the usage of Elasticsearch.
Adding more servers to this one big Elasticsearch cluster was considered a bad idea. On top of that, the functionality on Elasticsearch had just accumulated over the years (remember, since 2014), and there was pretty much no clear oversight and no clear ownership at that time.
At that time, site reliability engineers were responsible for keeping Elasticsearch operational, and I can give you a hint that server restarts don't help all that much when an Elasticsearch cluster is overloaded. So yeah, that was the situation in 2020.
So, just to give you a brief hint of what the performance looked like: when I joined Vinted, I started to run little performance experiments, which were all more or less just taking some queries, replaying them back into production, and seeing how it goes.
When I replayed a small batch of 10,000 queries during one of our easy Tuesdays (Tuesdays are the easiest days at Vinted), those 10,000 queries caused an additional 4 million slow queries to end up in the slow query log. That should give you some fear if you see such numbers.
So the performance was bad. The main way we run performance experiments is by replaying the queries from the query log or the slow log, in this case with our little tool called CAT, which stands for Kafka Elasticsearch Tool.
We take the queries from the slow log, replay them into some target Elasticsearch cluster, and store the responses in yet another Elasticsearch cluster for visualization. As you can see from the image, sometimes the replay involves three Elasticsearch clusters.
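The replay idea itself is simple enough to sketch. Below is a toy illustration, not the actual tool: the cluster URLs, the `slowlog` and `items` index names, and the `query` field inside the log documents are all assumptions made for the example.

```python
from elasticsearch import Elasticsearch

# Hypothetical clusters: where the logged queries live, the cluster under
# test, and the cluster where we store responses for visualization.
source = Elasticsearch("http://slowlog-cluster:9200")
target = Elasticsearch("http://target-cluster:9200")
results = Elasticsearch("http://results-cluster:9200")

# Pull a batch of logged queries (assuming each log document carries the
# original query body in a `query` field).
logged = source.search(index="slowlog", body={"size": 1000, "query": {"match_all": {}}})

for hit in logged["hits"]["hits"]:
    query_body = hit["_source"]["query"]
    response = target.search(index="items", body=query_body)
    # Keep only what we need for analysis: the query, its latency, and hit count.
    results.index(
        index="replay-results",
        body={
            "query": query_body,
            "took_ms": response["took"],
            "total_hits": response["hits"]["total"],
        },
    )
```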
But the replay can also easily be done just on your laptop. So yeah, that's the replay setup. Now, let's go to our experiments and lessons learned. One of the easiest tricks is to store your IDs as keywords in Elasticsearch. When I'm talking about keywords as IDs, I mean the case where you have simple Boolean queries with filters.
And you're filtering your documents on some attribute that holds IDs; for example, like here, for a country ID filter that has countries 2 and 4, you can do this optimization.
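To make the scenario concrete, here is a rough sketch of such a filtered query using the official Python client; the `items` index and the `country_id` field are made-up names for this example.

```python
from elasticsearch import Elasticsearch  # official Python client, assumed to be available

es = Elasticsearch("http://localhost:9200")

# A simple boolean query whose only job is to filter on an ID attribute.
# There are no range semantics here: the IDs are matched exactly.
query = {
    "query": {
        "bool": {
            "filter": [
                {"terms": {"country_id": [2, 4]}}
            ]
        }
    }
}

response = es.search(index="items", body=query)
print(response["hits"]["total"])
```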
So, a little context here. Elasticsearch indexes data from MySQL, which is our primary data store; if you want more details on that, you can check the Vinted engineering blog. Another thing is that a common practice for Ruby on Rails applications, which the Vinted backend is,
is to create database tables with primary keys as auto-increment integers. And a little quiz for all of you: which Elasticsearch data type would you use to store those integers from the MySQL database? If your answer is integer, because why not,
I hope that during the next couple of slides I will show you that this answer is not always the correct one. So yeah, in 2016 there was a blog post on the Elasticsearch website, written by Michael McCandless, about the evolution of numeric range filters in Apache Lucene.
The TLDR version of this blog post is that before this change in Lucene, integers were indexed as padded string terms, just regular strings. But after the change, integers were indexed as block k-d trees (BKD trees), so the underlying data structure changed.
The change in Lucene propagated into Elasticsearch from version 5.0. The outcome was that numeric data types became optimized for range queries; however, numeric data types continued to support terms queries, so no functionality was broken there.
However, from Vinted's point of view, the catch with this change is that we don't use IDs for range queries; we use IDs for simple filters with terms queries, as we've seen in the example on the previous slide. And this integer data type, optimized for range queries, meant degraded query latency for Vinted's use case. By how much? For our workload, when we did the replay with the IDs mapped as keywords, we saw a decrease of around 20 milliseconds at the 99th percentile, which was a significant improvement to our latency. And since we use up to ten such fields in every search query, this meant a huge performance improvement for us.
And the cool thing is that the required change was as simple as changing the index mappings and re-indexing all the data into the Elasticsearch cluster; no query change was required. So, easy money.
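A minimal sketch of what such a change might look like; the index and field names below are hypothetical, and the point is simply mapping the ID fields as keyword instead of integer and then re-indexing.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# New index whose ID-like fields are mapped as keyword instead of integer.
# Keyword terms are ideal for exact-match `terms` filters; integer/long
# fields are optimized for range queries instead.
es.indices.create(
    index="items_v2",
    body={
        "mappings": {
            "properties": {
                "country_id": {"type": "keyword"},  # was "integer"
                "brand_id": {"type": "keyword"},    # was "integer"
                "title": {"type": "text"},
            }
        }
    },
)

# Re-index the existing data; terms queries keep working unchanged because
# numeric values in queries are matched against keyword terms as strings.
es.reindex(body={"source": {"index": "items"}, "dest": {"index": "items_v2"}})
```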
To summarize this first lesson: remember that Vinted started to use Elasticsearch at a time when it was okay to index IDs as integers.
But after the upgrade to 5.0, IDs as integers silently became a performance issue. And this change, which broke nothing functionally, easily slipped under the radar of regular backend developers and then, some time later, backfired badly.
One of the outcomes was that developers thought Elasticsearch is just a slow database. So, I highly recommend trying this optimization on your workloads; let me know how it goes. Yeah, so we are done with the first lesson; now we can talk about filtering on dates.
Yeah, so let's first talk about date math. By date math, I mean queries where you have clauses such as now minus seven days. From the developer's point of view, writing such a filter on a date is as simple as hardcoding a string like now-7d, and that's it; you're done with the feature and you can proceed with your life. However, if most of the queries that you're sending to Elasticsearch have this date math, then the more queries you send to your cluster, the more CPU it requires to handle them all. Remember that queries with date math cannot be cached, which means that you are not leveraging the Elasticsearch caches, and caching is the primary reason why Elasticsearch can be so fast at doing its job. So, don't use date math in your filters; always write explicit timestamps in production workloads.
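To illustrate the difference, here is a sketch with a hypothetical `items` index and `created_at` field, where the explicit timestamp is computed and rounded on the client side.

```python
from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Date-math version: `now-7d` is evaluated at search time, so this request
# cannot be served from the request cache.
date_math_query = {
    "query": {"bool": {"filter": [{"range": {"created_at": {"gte": "now-7d"}}}]}}
}

# Explicit-timestamp version: the client computes the boundary and rounds it
# (here to the hour), so many requests share the same query body and can be cached.
boundary = (datetime.now(timezone.utc) - timedelta(days=7)).replace(
    minute=0, second=0, microsecond=0
)
explicit_query = {
    "query": {
        "bool": {"filter": [{"range": {"created_at": {"gte": boundary.isoformat()}}}]}
    }
}

es.search(index="items", body=explicit_query)
```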
Of course, in Kibana, when you're just playing with your data, writing date math expressions is perfectly fine. So, let's talk a bit more about date filtering; for example, let's take this innocent-looking filter, where you query a range of documents that were created some time before a given timestamp, which is the timestamp of today. Basically, translated into human language, this query asks Elasticsearch to collect all the documents that are not newer than timestamp X. However, remember that Vinted has been running for more than ten years already, and from now back to the beginning of the Vinted document collection,
you have a volume of ten years of documents, and this filter might match around 99% of all the documents in your index, which is not a good filter. So, what can we do about it? What if we rewrite this range filter into an inverted clause, where we say: do not take documents that were created later than the timestamp, the same timestamp from the previous example. Once again, this Elasticsearch query asks to collect documents that are not newer than timestamp X, but now, if the timestamp is today and the documents accumulated over the last ten years, the inverted clause itself would match only 1% of all the documents in your index,
which I would say is a good filter, because from the Elasticsearch perspective a more specific filter is a good filter. And a bonus tip from the Elasticsearch docs: when you are filtering on a timestamp, round the date to minutes, hours, or days; it depends on your use case. So, back to the date filtering example: when we replayed the inverted clause against our production clusters, we saw a reduction of 99th-percentile latency by around 15%,
which at that time translated into an additional 10-millisecond decrease in query latency. And when we deployed this change to production, we saw a latency chart like that one: as you can see, the latency goes on and on, then the deployment happens, and then the latency drops. When you see a chart like this, you immediately feel like, yes, we achieved something nice today.
So, a little summary on filtering on dates. Remember: don't use date math in production, because it prevents caching. Another lesson is to write your filter clauses on timestamps, or any other data types for that matter, in a way that matches fewer documents.
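As a rough sketch of the two ideas above, the inverted clause and the rounded timestamp, with made-up index and field names:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Original formulation: "everything created before X". Over ten years of
# data this clause matches roughly 99% of the index.
broad = {
    "query": {
        "bool": {"filter": [{"range": {"created_at": {"lte": "2021-01-15T00:00:00Z"}}}]}
    }
}

# Inverted formulation: "exclude everything created after X". The must_not
# clause itself only matches the newest ~1% of documents, which is a much
# more specific filter; the rounded timestamp keeps it cache-friendly.
inverted = {
    "query": {
        "bool": {"must_not": [{"range": {"created_at": {"gt": "2021-01-15T00:00:00Z"}}}]}
    }
}

es.search(index="items", body=inverted)
```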
So, the two previous lessons were about how to get quick wins from the Elasticsearch performance point of view; now let's proceed with the third lesson, which plays a bit more into how to scale Elasticsearch for 1 million requests per minute and beyond, on this beyond part.
One of the strategies that we applied is to create feature-focused indices. A little context on those feature-focused indices.
Most Elasticsearch installations start in a similar fashion. You have your collection of documents; it might be large, it might be growing fast. You usually store this collection in one index, potentially with many shards, of course, and then the functionality, meaning the multiple query types on that index, just accumulates over time: you add new fields, and so on. And it just happens that the same index starts to handle workloads such as search, aggregations, counts, and so on. It also happens that when your search traffic increases, with such a setup your latency also grows. So, when we were working on scaling Elasticsearch, we considered splitting the workload into many clusters, but we decided not to do it because of the operational complexity it would bring.
Remember that we had one big Elasticsearch cluster, and an Elasticsearch cluster should be, well, you know, elastic: it should accommodate the increased workloads. So, we had another idea for how to proceed from the situation described previously. Yes, we still have this one huge cluster, but what if we indexed the same data multiple times, meaning that we index the same data into two different indices, and then route query traffic of some specific type to the new index and see how it goes? If we can get something out of such a setup, and if it looks promising, we can optimize this split workload later and separately. This should remind you of a divide-and-conquer strategy.
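In practice, that kind of routing can live entirely in the search client. A minimal sketch, assuming hypothetical index names and query types:

```python
# Hypothetical mapping from query type to its feature-focused index.
# The same documents are indexed into each of these indices; only the
# workload that is routed to them differs.
INDEX_BY_QUERY_TYPE = {
    "fulltext_search": "items_search",
    "newest_by_brand_agg": "items_recent_aggs",
    "saved_search_counts": "items_counts",
}

def pick_index(query_type: str) -> str:
    """Route a query to its feature-focused index, falling back to the base index."""
    return INDEX_BY_QUERY_TYPE.get(query_type, "items")

# Aggregation traffic goes to the index tuned for aggregations.
index_name = pick_index("newest_by_brand_agg")  # -> "items_recent_aggs"
```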
So, for example, here is what we did while working in this way. We have a query type that handles workloads like: give me the newest items in our catalog, grouped by a user's favorite brands, that were uploaded over the last week. Immediately that should ring a bell: this is basic Elasticsearch functionality, the top hits aggregation. The data shape, meaning the data mappings in the Elasticsearch cluster, is exactly the same as in the original base index, because that index was handling such a workload before. So, it was easy to split, and one additional thing that can be inferred from the requirements of such a workload is that only recent data is really needed: this aggregation happens only over the last week.
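Roughly, such a query could look like the sketch below; field names like `brand_id` and `created_at` are assumptions, and the favorite brand IDs would come from the user's profile.

```python
# "Newest items per favorite brand over the last week", expressed as a
# terms + top_hits aggregation; all index and field names are illustrative.
newest_per_brand = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"terms": {"brand_id": ["12", "34", "56"]}},                 # the user's favorite brands
                {"range": {"created_at": {"gte": "2021-01-08T00:00:00Z"}}},  # explicit, rounded timestamp
            ]
        }
    },
    "aggs": {
        "by_brand": {
            "terms": {"field": "brand_id"},
            "aggs": {
                "newest": {
                    "top_hits": {
                        "size": 4,
                        "sort": [{"created_at": {"order": "desc"}}],
                    }
                }
            },
        }
    },
}
```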
So, how did it go? You can evaluate such a change in multiple ways, but, for example, let's take the Elasticsearch request cache point of view. The request cache is a shard-level cache that stores the results locally on each shard. Basically, when a shard that handles a search request sees a query that it just handled a second ago, it says: okay, I can deliver the results straight from the cache without doing all the complicated work. The request cache is useful for frequently repeated search requests, for example aggregation queries, which is exactly the workload we are dealing with here. When we did the split and measured the request cache hit rate for this new specialized index that did the aggregations, the hit ratio was somewhere around 42%, while for the index that was handling many workloads it was only 6%, meaning that it increased about seven times, which was a nice win. Of course, these are two different workloads, two different query sets, and so on, but still, we now have one very feature-focused index that handles the workload easily and doesn't cause many troubles from the cluster-wide perspective.
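If you want to check the same numbers on your own cluster, per-index request cache statistics are exposed via the index stats API. A small sketch with hypothetical index names:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def request_cache_hit_ratio(index: str) -> float:
    """Compute the request cache hit ratio for one index from the index stats API."""
    stats = es.indices.stats(index=index, metric="request_cache")
    cache = stats["_all"]["total"]["request_cache"]
    hits, misses = cache["hit_count"], cache["miss_count"]
    return hits / (hits + misses) if (hits + misses) else 0.0

print(request_cache_hit_ratio("items_recent_aggs"))  # specialized index, e.g. ~0.42
print(request_cache_hit_ratio("items"))              # mixed-workload index, e.g. ~0.06
```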
So, two different workloads, and they are not interfering with each other, because we have big clusters and enough data nodes to accommodate the shards of each of the two indices separately.
That's one example. As of now, we have five such splits of query types, and we are planning to do even more in the future. So, a little summary on this lesson learned about feature-focused indices.
If you have a large Elasticsearch cluster that seems somewhat underutilized, meaning it has enough capacity for such tricks as indexing the same data multiple times, I would advise you to proceed with such a strategy. It can be generalized as divide-and-conquer: split the query traffic by use case and then optimize each identified use case later on, separately. There are good things about such a strategy. The first is that it's easy to prioritize: you can open up your query log, if you have one, look at the query types you have, and then prioritize by the sheer number of queries of a particular type. The more queries, the more it makes sense to split it out, optimize it, and work on it. Another cool thing is that when you split your index and your workload in such a way, it's easier to measure the optimizations, because the performance metrics that Elasticsearch gives you are not polluted by unrelated queries to your index. However, there is also a downside to such a strategy: it requires more work, meaning that when a change in your index mapping touches a common attribute, all the indices that were split from this base index must be re-indexed more or less at the same time to accommodate the new change. So, it requires more work. But overall, the results are promising, and the engineers that are on duty are happy about such a strategy, because it makes Elasticsearch operate as smoothly as a Rolls-Royce. Yeah, so congratulations, we are done with the lessons; let's proceed to the last part of the presentation, which is how the Elasticsearch installation looks as of now.
So, as of now we have Elasticsearch version 7.9.3; we managed to do the upgrade from version 5 to version 7 and everything that it required. Now, instead of one huge, monstrous cluster, I would say, we have three clusters that are handling the same workload, and all three clusters are still huge. Each of the clusters has around 160 data nodes, and each data node has 16 dedicated CPUs with 48 gigabytes of RAM; the installation is still on bare metal. On top of those three big clusters, we have a fourth cluster of similar size that we are using for offline experimentation. If needed, we can use that offline cluster for additional capacity, or for performance testing, and so on. And as promised by management, the usage grew a lot: during the peak hours we have around 1 million requests per minute. Our data set also grew significantly; instead of 160 million, we now have 360 million documents that are being searched with every query. The P99 latency during the peak hours floats around 120 to 150 milliseconds. As for the timeouts, and by timeouts we mean requests that take more than 500 milliseconds, they are negligible: just 0.03% of all the queries that we handle. Yeah, so those are the numbers. From the organizational perspective, we now have a team that is eight people strong and is responsible for keeping Elasticsearch operational.
Because we learned from our past mistakes that you cannot just expect Elasticsearch to handle more workload, we do capacity testing constantly, and we test for a two-times increase in query traffic, in terms of both the document count in the indices that we have to search and the query throughput. From our engineering directorate's perspective, Elasticsearch is now seen as a system that can accommodate the growth that Vinted is expecting or experiencing. One cool trick that we implemented is that we performance-test new functionality before deploying it into Elasticsearch, with realistic data and on clusters of realistic size. And, as I mentioned, we have the team that is on duty, meaning that the team actually knows what workload Elasticsearch is handling and how to solve operational issues, not only by restarting the servers. So, yes, that's the scale of the Elasticsearch installation at Vinted as of January 2021. Yeah, let's conclude my presentation.
So if you ask whether everything is perfect now, I would say no, because we still have issues. One of the primary issues is that Elasticsearch is resource hungry, and if we want to maintain the strict operational requirements that we have, well, we have to have big clusters. Version upgrades are still not perfect, but the good thing is that our offline cluster helps with that: we can upgrade the offline cluster and test whether it handles the workload. Also, the machine learning engineers are not entirely happy with Elasticsearch; they use the data from Elasticsearch, but they deploy their search-hits re-ranking models outside of Elasticsearch, which brings operational complexity. We would like to have everything inside Elasticsearch, because it would be less complicated. And one complaint, mostly from my side, is that the Elasticsearch default installation offers very few tools for search relevance work, meaning that we have to go somewhere else to find the tools, or write them ourselves. And since we are, I would say, big users of Elasticsearch, we are likely to be the ones that catch errors or bugs in Elasticsearch, so from time to time we report back our findings. So, yeah, that's that. That was all I had for you today.
Thank you for listening, and I hope that you enjoyed the presentation and Berlin Buzzwords in general. Yeah, thank you. All right, thanks for the talk there. Looks like we're just about on time, but since there's no talk after this, there's one question coming in, so maybe we could go with that one. The question is: thanks for the presentation; do you get any search rejections during peak times, and how do you cope with them? Do you set any auto-scaling policies? Yes, good question, thanks. In terms of auto-scaling policies, we have limited opportunity to do that, because our installation is on bare metal,
so we have to provision our clusters, let's say, half a year up front. So there is really no easy way to do auto-scaling. In terms of rejections, well, as of now the biggest problem Elasticsearch causes is that a request sometimes times out; we just retry it, and most of the time it succeeds, because we have enough, let's say, metal to handle the workload, and we optimized things for query latency. So rejections, as of now, are not a problem. However, some time ago, during the peak hours, our Elasticsearch cluster was on fire and there were some rejections. But I hope that we have solved this problem, and we have more tricks up our sleeves to make the situation even better. Awesome. I haven't seen any... Okay, I see one more question coming, but before that, I'd like to let everybody know that there's the spatial lounge that a lot of folks are hanging out at. So once you're done with this talk,
feel free to hang out there. Let me read out... Okay, looks like three more questions came in. I'm going to read them out one by one. So the first question there is, in general, what is your sharding strategy for best performance or capacity?
Few big shards, or multiple small primary shards? Okay, so, about sharding. As I mentioned in my presentation, a year ago we had a sharding strategy of 84 primary shards with four replicas. So we had many, many shards, and at that time the approach was kind of: just add more shards, and the query latency will not go up as the document count increases. However, at some point this stopped working, because, well, it's a limited strategy, and as of now we have somewhat fewer shards, which is 36. How did we come up with that number? As I mentioned, we have an offline cluster where we can replay or route the production traffic and see how it goes, meaning collect metrics. So we tried multiple sharding strategies: having fewer primary shards and more replicas, or more primary shards and fewer replicas, because we try to have at most as many shards per index as we have data nodes; that is the upper limit on how many shards in total we would like to have. And at that point in time, it seemed that 36 shards was just the sweet spot: it does not create too much indexing overhead on the cluster, and it handles the queries with low latency. So that's that. And, yes, we have a couple of rules of thumb for our shards when we are considering how many shards we should have next month: a shard should not have more than 7 to 10 million documents, and a shard should not be bigger than, let's say, 10 gigabytes.
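As a rough illustration of those rules of thumb, here is a tiny helper that derives a primary shard count from a document count and an index size; the limits are the ballpark figures mentioned above, and the index size used in the example is made up.

```python
import math

def suggest_primary_shards(
    doc_count: int,
    index_size_gb: float,
    data_nodes: int,
    max_docs_per_shard: int = 10_000_000,  # roughly 7-10 million documents per shard
    max_gb_per_shard: float = 10.0,        # roughly 10 GB per shard
) -> int:
    """Smallest shard count that respects both rules of thumb, capped at the node count."""
    by_docs = math.ceil(doc_count / max_docs_per_shard)
    by_size = math.ceil(index_size_gb / max_gb_per_shard)
    return min(max(by_docs, by_size, 1), data_nodes)

# Ballpark figures: 360 million documents, a made-up index size, 160 data nodes.
print(suggest_primary_shards(doc_count=360_000_000, index_size_gb=350, data_nodes=160))  # -> 36
```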
Awesome. Okay, the next question for you is: you mentioned partitioning the index by types of query. Can you elaborate on at what level these queries are different? Is it at the aggregation level, or the kind of data they access?
Yeah, so, like I mentioned in the presentation, we have many splits of that index. One example I mentioned is that we have a regular full-text search index, and then we have an index dedicated to handling basically one type of aggregation. However, we have other splits. For example, the third split handles the count queries: users can register a query, and then when they, let's say, open the app after a couple of days, they want to know how many new items were uploaded during the last couple of days. For such a workload we also have a separate index. We also split by whether the query has text at all: one index is for the queries into Elasticsearch that actually have no text, and there we changed the indexing structure completely; instead of sharding, we created multiple time-based indices to handle such a workload, while full-text search is handled by a single index. Well, the same structure, it's just that the queries are different, and different in a very small way: the no-text workload doesn't have a filter on, or any boosts on, text matching. Yeah, so thank you for the question. All right. Looks like the questions are flowing in,
so if you wish to answer them here, or if you want to hang out in the breakout-room channel, we can do that, or I can read out the next couple of questions. Oh, we can do questions now. All right. Okay, the next question is,
have you considered having one index per country? Would that increase the request cache hits? Yeah, good question. Thanks. So we have a mixture of both.
Since, some time ago, Vinted operated in such a way that it opened up a new, let's say, portal per country, at that point in time all the indices per country were separate, despite the fact that they were in the same cluster; the indices themselves were separate. However, the company moved in the direction of, let's say, joining the markets, so that Germany can shop from France, and France from Germany; basically, that's the idea. And when you think about how to accommodate such a workload, you have to search both catalogs at the same time, which means that the data should somehow get close together, meaning basically the same index. Of course, we don't have all the data in one index; for example, we also have splits by language, so the German items are in one index and the French ones are in a second index. So, the TLDR of this long answer is: we had such a setup, but due to business circumstances we had to transition ourselves to, let's say, a combined index per workload, not per country. Cool. And the last question coming in is,
how much indexing traffic do these clusters handle, roughly? Good question, thanks. Of course, it fluctuates; it depends on the season, the weekday, or the time of day. But roughly, it's around 5,000 updates per second to our Elasticsearch clusters. So that is the indexing throughput. And as you know, every additional replica in Elasticsearch causes additional indexing, meaning that if you have more replicas, the same data has to be indexed multiple times. So that's the indexing volume and workload.
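As a back-of-the-envelope illustration of that write amplification, using the four replicas mentioned earlier in the sharding discussion as an example:

```python
# Each update is written to the primary shard and to every replica.
updates_per_second = 5_000
replicas = 4                                  # example value from the sharding discussion

copies_per_document = 1 + replicas            # primary + replicas
shard_level_writes = updates_per_second * copies_per_document
print(shard_level_writes)                     # 25,000 shard-level writes per second
```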